article-seirdy-an-experiment-to-test-github-copilot-s-legality.mw (11221B)
---
.SH seirdy
An experiment to test GitHub Copilot's legality
.2C 157v
.
.QP
This article was posted on 2022-07-01 by Rohan Kumar
.FS
https://seirdy.one/posts/2022/07/01/experiment-copilot-legality/
gemini://seirdy.one/posts/2022/07/01/experiment-copilot-legality/index.g…
.FE
and is now republished in this newspaper, with permission (CC-BY-SA 4.0).
.
.
.IP "Preface"
.
.PP
I am not a lawyer.
This post is satirical commentary on:
.
.IP \(bu
The absurdity of Microsoft and OpenAI's legal justification for GitHub C…
.
.IP \(bu
The oversimplifications people use to argue against GitHub Copilot (I do…
.
.IP \(bu
The relationship between capital and legal outcomes.
.
.IP \(bu
How civil cases seem like sporting events where people “win” or “l…
.
.PP
In the process, I intentionally misrepresent how the judicial system wor…
I portray the system the way people like to imagine it works.
Please don't make any important legal decisions based on anything I say.
.
.PP
The only section you should take seriously is “Context:
the relevant technologies”.
.
.
.IP "Introduction"
.
.PP
GitHub is enabling copyleft violation \fBat scale\fR with Copilot.
GitHub Copilot encourages people to make derivative works of source code…
This facilitates the creation of permissively-licensed or proprietary de…
.
.PP
Unfortunately, challenging Microsoft (GitHub's parent company) in court …
their legal budget probably ensures their victory, and they likely alrea…
How can we determine Copilot's legality on a level playing field? We can…
.
.PP
A chat with Matt Campbell about a speech synthesizer gave me a horrible …
I think I know a way to find out if GitHub Copilot is legal:
we could use its legal justification against another software project wi…
Specifically, against a speech synthesizer.
The outcome of our actions could set a legal precedent to determine the …
.
.
.IP "Context: the relevant technologies"
.
.PP
Let's cover the technologies and actors at play before I start my evil m…
.
.
.IP "Exhibit A: GitHub Copilot"
.
.PP
GitHub Copilot is a predictive autocompletion service for writing softwa…
It's powered by OpenAI Codex,
.FS
https://openai.com/blog/openai-codex/
.FE
a language model based on GPT-3.
.FS
https://en.wikipedia.org/wiki/GPT-3
.FE
It was trained using the source code of public repositories hosted on Gi…
In response to a Request for Comments from the US Patent and Trademark O…
.FS
See Comment Regarding Request for Comments on Intellectual Property Prot…
for Artificial Intelligence Innovation submitted by OpenAI to the USPTO.
https://www.uspto.gov/sites/default/files/documents/OpenAI_RFC-84-FR-581…
.FE
.
.PP
Many of the code snippets it suggests are exact copies of source code fr…
For an example, see this tweet:
“I don't want to say anything but that's not the right license Mr Copilot.”
.FS
https://nitter.net/mitsuhiko/status/1410886329924194309
https://twitter.com/mitsuhiko/status/1410886329924194309
.FE
by Armin Ronacher.
.FS
https://lucumr.pocoo.org/about/
.FE
It contains a screen recording of Copilot suggesting this Quake code.
.FS
https://github.com/id-Software/Quake-III-Arena/blob/master/code/game/q_m…
At line 552
.FE
When prompted to do so, it obediently fills in a permissive license.
That permissive license violates the Quake code's GPL-2.0 license.
Copilot provides no indication that a license violation is taking place.
.
.PP
GitHub performed its own research into the matter.
.FS
I doubt anybody worth their salt would count on a company to hold itself
accountable, but at least they tried.
.FE
You can read about it on their blog:
GitHub Copilot research recitation,
.FS
https://github.blog/2021-06-30-github-copilot-research-recitation/
.FE
by Albert Ziegler.
.FS
https://github.com/wunderalbert
.FE
I'm not convinced that it accounts for the fact that suggested code migh…
.
.
.IP "Exhibit B: The Eloquence speech synthesizer"
.
.PP
I recently had a chat with Matt on IRC about screen readers and differen…
I mentioned that while I do like some variety, I always find myself retu…
.FS
https://github.com/espeak-ng/espeak-ng/
.FE
He shared some of my fondness, and also shared his preference for a simi…
.
.PP
Downloads of Eloquence are easy to find (it's even included with the JAW…
Nuance acquired Eloquent Technology, the developer of Eloquence.
Microsoft later acquired Nuance.
.
.
.IP "Eloquence sample audio"
.
.PP
Matt recorded this sample audio clip of Eloquence reading some text.
.FS
https://seirdy.one/a/eloquence.mp3
.FE
The text is from the introduction of Best practices for inclusive textua…
.FS
https://seirdy.one/posts/2020/11/23/website-best-practices/
.FE
.
.QP
My primary focus is inclusive design.
Specifically, I focus on supporting underrepresented ways to read a page.
Not all users load a page in a common web-browser and navigate effortles…
Authors often neglect people who read through accessibility tools, tiny …
I list more niches in the conclusion.
Compatibility with so many niches sounds far more daunting than it reall…
if you only selectively override browser defaults and use plain-old, sem…
.
.PP
I like the Eloquence speech synthesizer.
It sounds similar to the robotic yet predictable voice of my beloved eSp…
Unfortunately, Eloquence is proprietary.
.
.
.IP "Exhibit C: Deep learning speech synthesis"
.
.PP
Deep learning speech synthesis
.FS
https://en.wikipedia.org/wiki/Deep_learning_speech_synthesis
.FE
is a recent approach to speech synthesizer creation.
It involves training a deep neural network on voice samples, and using t…
One synthesizer using deep learning speech synthesis is Mozilla's TTS.
.FS
https://github.com/mozilla/TTS
.FE
.
.PP
Zero-shot approaches could allow a pre-trained model to generate multipl…
YourTTS
.FS
https://doi.org/10.48550/arXiv.2112.02418
.FE
is one such example.
This could allow us to synthetically re-create a person's voice more eas…
.
.
.IP "My horrible plan"
.
.PP
My horrible plan revolves around going through two different lawsuits to…
.
.PP
If this succeeds, we have new legal justification that GitHub Copilot is…
It's a win-win situation.
.
.
.IP "Part One: set a precedent"
.
.IP 1.
Train a modern text-to-speech (TTS) engine using the voice of a proprietary…
Keep the model's internals hidden.
.
.IP 2.
Then release the final TTS under a permissive license.
Remember, we're still keeping the machine-learning model hidden!
.
.IP 3.
Wait for that company to file suit.
.FS
If the stars align, you could file an anticipatory suit against the comp…
It's common for declaratory judgement regarding intellectual property ri…
https://en.wikipedia.org/wiki/Declaratory_judgment
.FE
.
.IP 4.
Win or lose the case.
.
.
.IP "Part Two: use that precedent against Microsoft's Nuance"
.
.PP
Our goal here is to get the same legal outcome as the low-stakes “tria…
.
.PP
Microsoft owns Nuance.
Nuance previously bought Eloquent Technology, the developers of the Eloq…
.
.IP 1.
Repeat Part One against Nuance speech synthesizers, including Eloquence.
Go to court.
.
.IP 2.
Have the ruling from Part One cited as legal precedent.
.
.IP 3.
Achieve the same outcome as Part One, demonstrating that we have indeed …
.
.
.IP "Implications of the outcomes"
.
.PP
If we \fIwin\fR both cases:
Microsoft has the legal high ground.
Making a derivative of a copyrighted work using a machine-learning algor…
.
.PP
If we \fIlose\fR both cases:
Microsoft does not have the legal high ground.
We have good judicial precedent against Microsoft to use when filing sui…
.
.PP
Either way, it's an absolute win for free software.
Taking down Copilot protects copyleft from enabling proprietary derivati…
But if we accidentally win these two low-stakes “test” cases, we sti…
we can liberate huge swaths of proprietary software, starting with speec…
.
.
.IP "Update: on satire"
.
.PP
This post isn't “satire through-and-through” like something from The…
Rather, my intent was to make some clear points, but extrapolate them to…
I don't think I was clear enough when doing this.
I'm sorry.
.
.PP
Copilot has been found to suggest significant amounts of code that is da…
It does this without disclosing obligations that come with those works' …
Training a model on copyrighted works may not be wrong in and of itself;…
Copilot's users could apply proprietary licenses to the generated works,…
.
.PP
When a tool almost exclusively encourages problematic behavior, the make…
GitHub and OpenAI have not demonstrated a sufficiently careful approach.
.
.PP
I don't think that “going after” a smaller player just to manipulate…
The fact that this idea seems plausible to some of my readers shows how …
Even if it's accurate (I doubt it's accurate, but I'm not certain), it's…
Judicial systems incentivise too much predatory behavior.
.
.
.IP "Corrections"
.
.PP
It's come to my attention that Eloquence may or may not still belong to …
Further research is needed.
Eloquent Technology was acquired by SpeechWorks in 2000.