

Do you have some source for this IP thing in the EU? I wasn’t aware of any new privacy laws.
A software developer and Linux nerd, living in Germany. I’m usually a chill dude but my online persona doesn’t always reflect my true personality. Take what I say with a grain of salt, I usually try to be nice and give good advice, though.
I’m into Free Software, selfhosting, microcontrollers and electronics, freedom, privacy and the usual stuff. And a few select other random things, too.
You’re right, the private data is a bit of a contrived example. I just wanted to make the argument that this isn’t just about copyright law. Something could be fair use under copyright law, but still illegal to use for other reasons. Which is problematic when doing unsupervised web scraping, for example. It’s definitely an issue, but out of scope if we limit the discussion to copyright only.
I’ve just skimmed the last article, I’m going to read it tomorrow. But I don’t think I’d like to argue for extending copyright. I think that would be bad. But I think it’s debatable whether AI training falls into that category. I’m not sure how it is in different jurisdictions… Maybe it’s clear in the US? I always struggle to read American legislation. I can just say it’s not clear where I live. And that comes with consequences: Companies do AI in other countries like the USA or China, rather than in the EU. Which is an issue for our economy and scientific progress. And everywhere where law isn’t clear enough, that’s a disadvantage for smaller companies/institutions or individuals, since it’s the big companies who can afford lawyers easily. And it has consequences for me personally. For example Meta’s use policy for the newer Llama models excludes Europeans. I’m not allowed to use it. That might not be about copyright either, but definitely due to unclear regulations.
So I don’t advocate for extending copyright. My stance is, we don’t have clear regulation in the first place. I’d leave all exemptions and specifics in place. We can leave libraries, music, research and reverse engineering as is. But the current warfare is super unhealthy. We have some companies scraping everything, meanwhile other people come up with tarpits and countermeasures like Cloudflare with their AI labyrinth last week… One newer business model is introducing walled gardens so companies can make sure they’re the only ones selling their user data… I think that’s all very unhealthy. And it favors large companies doing that “research”. Meanwhile the internet gets flooded with slop, half the internet services are barely usable and we might end up with dystopian Skynet corporations dominating information flow anyways. And I think that’s the bigger issue than copyright. If AI proves to be disruptive, it needs to be used somewhat ethically. And I think the only way to do it is regulation. We need to even the field so research and non-profits get a chance. We currently have “smaller” startups participating, and several companies release open-weight models. But we can’t rely on their altruism. My prediction is they’ll all stop once this starts to interfere with their business or those models get really useful. And then it’s going to be OpenAI and Anthropic & Co who get to decide what kind of information the world has access to. Which would be very bad. And they also offer little transparency. More and more people rely on these services and AI is very much a black box. And the large companies stopped disclosing what went in a few years ago when all the copyright lawsuits started. The first Llama model still came with a scientific paper, detailing all the datasets that went in. But as far as I understand, they stopped soon after, when copyright lawsuits started. And the rest are trade secrets.
So if someone were to use ChatGPT (which lots of people do), they’re completely at the mercy of OpenAI. OpenAI gets to decide in which ways the model is biased (or not), what it can and can not answer, what is fed to the users. I think that’s the main issue with it. (Along with slop.) Copyright of training data is some sort of sideshow. But I still think we have a lot of unaddressed issues with AI. And leaving that open is just going to help the big players. I think we need clearer regulation so a small company who can’t afford a lot of lawyers can also be 100% sure whether something is fair use or whether it isn’t. And personally I think we need to hold them all accountable and force them to be more transparent with everything. Like a rough estimation of the datasets. And I’d force providers of generative AI services to implement watermarking to at least try to tackle slop and people doing their homework with ChatGPT. Sure this can all be circumvented, but we can at least try to do something about it. And I’d also like it if big companies bought at least one copy of a book they use to train their AI. Meta or OpenAI can afford to pay a few million. Otherwise they just leech off people’s content. I think it’s unfair that some people take quite some time to write books, Reddit comments, Wikipedia articles and then someone else gets to make big profit from that. It’s not very straightforward to solve it, but I don’t think it’s very healthy for humanity to just hand over everything to greedy companies. And I also don’t think it’s healthy to embark on warfare, which seems to happen right now. That way we’re likely to all lose access to free information.
Private conversations are something entirely different from publicly available data
But that’s kind of the question here… Is data processing Fair Use in every case? If yes, we just also brought private conversations and everything in. If not: What are the requirements? We now need to talk about which use cases we deem legitimate and what gets handled how… I think that’s exactly what we’re discussing here. IMHO that’s the point of the debate… It’s either everything… or nothing… or we need to discuss the details.
Compensation for essentially making observations will inevitably lead to abuse of the system and deliver AI into the hands of the stupidly rich, something the world doesn’t need.
I’m not sure about that. I mean I gave some examples with licensing music at events and libraries (in general). Does that also get abused by the rich? I don’t think so. At least not that much, so that makes me think that might be a feasible approach. Of course it gets more complicated than that. Licensing music for example brings in some collecting societies, and all those agencies have proven to be problematic in various ways and the whole licensing industry isn’t exactly fair, they also mainly shove money into the hands of the rich… So a proper solution would be a bit more complicated than that.
I mean I’d like to agree with you here and have a straightforward parallel on how to deal with AI training datasets. But I don’t think it’s as easy as that. We can’t just say processing data is Fair Use, because there are a lot of details involved, as I said with privacy. We can’t process private data and just do whatever with it. We can’t do everything with copyrighted material, even if it’s in the public. If a use is legitimate already depends on the details. And I think the same applies to AI. It needs a more nuanced perspective than just allow or prohibit everything.
I’m not that educated on US law and if everything is subsumed under Fair Use. I believe in Germany, we have a separate rule for ephemeral copies during data processing and network transfers (§44a UrhG). So we don’t have to deal with that using a law that was more concerned with someone photocopying a book. And I believe some countries distinguish between commercial interests and non-profit research. Plus we have exemptions for example allowing someone to play music at their non-profit events, even without consent of the copyright holder. They still need to pay them a “fair” amount, but it’s not up to the copyright holder to decide… We specify under what circumstances libraries can use content, again differentiate between interests, and we have a rudimentary law concerning data mining for research since 2017.
I think some specific laws like that would be more suited to guide the issue with AI towards a healthy solution, than use one blunt tool for everything. Why not say AI training is allowed, but it requires a fair compensation? We could even have a standardized way of opt-in or opt-out… I’m not sure if we need that. But I’m fine with my blog posts and Free Software projects end up in some AI. But I don’t want them to listen in on my private conversations, like for example an Alexa could do… I believe that requires a law distinguishing between that. If everything is Fair Use, I can say goodbye to privacy, but at the same time cancel my Netflix and Spotify subscription, since I’m going to claim, I’m just collecting all of that for future AI training.
I -personally- think we can’t allow Amazon to spy on me and just claim it’s fair use. So context matters. And I also think the goal and nature of the AI matters. Research needs to be less strict than commercial interest. And I don’t think networking of digital devices can be handled the same way as AI training, I strongly believe that requires separate laws and also needs to factor in if there is some legitimate interest to begin with.
Those are great links. I think I already read Cory Doctorow’s post.
I think I already struggle with the premise. I think Google, Facebook, etc using my data is NOT Fair Use. They can not just publish my full name, pictures and texts without my explicit consent.
And this is kind of lumping everything together again… For-profit AI and open-weight models to the benefit of humanity aren’t the same thing. And I think we should give open-weight models some advantage by applying different rules. I.e. let people use data more freely if they contribute something back and the resulting product can be used freely as well. And make the rules more strict for big and closed for-profit services. And demand more transparency as well.
I mean realistically, we don’t have any proper rules in place. The AI companies for example just pirate everything from Anna’s Archive. And they’re rich enough to afford enough lawyers to get away with that. And that’s unlike libraries, which pay for the books and DVDs on their shelves… So that’s definitely illegal by any standard.
But I agree that learning something from a textbook is a different thing than copying it. The resulting knowledge escapes the copyrighted material. I believe that’s the same no matter if it’s machine learning or me learning computer programming with textbooks… The thing is just that you can’t steal in the process. That’s still illegal. IMO.
One of my fears is that AI is as disruptive as people think. And that the market is going to be dominated by unsympathetic big-tech companies, due to the nature of it. I think we need some good legislation to push AI in the right direction, or we’re going to end up in some scifi dystopia where big companies just shape the world and our lives to their liking.
Actually there is a small but important difference. Libraries usually don’t generate (much) revenue. They’re funded by public money, institutions… And a subscription is like 25€ annually or it’s free… While AI companies have billions of dollars turnover and they’re very much for-profit. And Fair Use has other applications as well. It allows science, allows me to record television or listen to music in the car or together with friends. I don’t think we can lump all of this together.
Seems the Admin took some action over at the community and banned them for: sockpuppet account, disposable email, vote manipulation, ban evasion
If anything, this leads to more control, more tracking to enforce bans and less privacy for everyone. And adds toxicity. So my final verdict is: No. This takes away privacy.
Idk. The last accounts I suspect to be that person are:
I don’t see the point. We could go some more through the comicstrips community and the modlog and puzzle the pieces back together… But I really don’t see what kind of privacy this offers. I mean this strange behaviour kind of draws more attention, not less… Maybe they’d like to chime in to tell us.
Could also be a person with behavioural peculiarities, or they’re high on drugs.
I see. What an asshole. It’d be best if you brought that up with the sh.itjust.works admins. Maybe they would like to ban them for stalking people, ban-evasion or whatever. And/or [email protected] if you want more attention and drama.
I’d say this is unacceptable behaviour.
I highly doubt this kind of behaviour makes Lemmy a better place. We have other places for these kinds of things, like anonymous imageboards.
Has it come down to changes like this to be noteworthy, or what kind of news article is this? I kind of want my minute back reading this.
Hehe, I don’t know if they don’t want to understand it, or if it’s a lack of technical knowledge… But yes. In the digital realm, a copy and the original are identical in every way, no matter how you twist it. And you can’t even properly transfer any item in the same sense as it applies to physical items. (Unless we’re talking about quantum computers or something like that…)
Sorry, this just isn’t correct. Yes, you can ask for almost anything and it’ll be alright and merely asking a question is completely legal.
The issue is, you then proceed to do a second step. And that is transferring the data. And that is a separate thing. You then initiate the actual transfer. Your computer actively does that. It keeps the transfer going and receives the network packets. It literally copies them into RAM and then copies them again onto your harddrive. To make your local copy. The uploader merely reads it from their harddisk and hands it out, they do one copy operation less. Though they’re still the distributor.
I think any expert witness would testify in court, that your computer as the downloader does two copy operations, at least in the technical sense of the term. And that you’ve ultimately also initiated the transfer as the downloader due to how TCP/IP works.
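To make the two copy operations a bit more concrete, here’s a minimal sketch in Python. Take it with a grain of salt: it’s a toy using a local socket pair instead of a real network connection, and `PAYLOAD`, `upload`, `download` and `run_demo` are names I made up for illustration. It isn’t how any particular download client works, it just shows the receiving side copying data into a RAM buffer and then onto disk.

```python
import os
import socket
import tempfile
import threading

# Hypothetical stand-in for the file the uploader is sharing.
PAYLOAD = b"example file contents " * 1000


def upload(sock: socket.socket, data: bytes) -> None:
    """Uploader side: hand the data out over the connection."""
    sock.sendall(data)
    sock.close()


def download(sock: socket.socket, path: str) -> int:
    """Downloader side: copy incoming data into RAM, then from RAM to disk."""
    received = 0
    with open(path, "wb") as out:
        while True:
            chunk = sock.recv(4096)  # copy 1: incoming data -> buffer in RAM
            if not chunk:            # empty result means the sender closed
                break
            out.write(chunk)         # copy 2: RAM buffer -> local file
            received += len(chunk)
    sock.close()
    return received


def run_demo():
    """Run uploader and downloader locally; return (bytes received, file contents)."""
    up, down = socket.socketpair()  # local connected pair, stands in for the network
    sender = threading.Thread(target=upload, args=(up, PAYLOAD))
    sender.start()
    fd, path = tempfile.mkstemp()
    os.close(fd)
    try:
        size = download(down, path)
        sender.join()
        with open(path, "rb") as f:
            return size, f.read()
    finally:
        os.remove(path)
```

Calling `run_demo()` returns the number of bytes received and the contents of the local file, which ends up byte-for-byte identical to what the uploader sent. The loop in `download` is the part that runs on the downloader’s machine and keeps the transfer going.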
The thumbdrive example is a bit contrived. I think you might get away with that, though. Unless you plug it into your computer. Because then all the copying to RAM and harddrive etc starts again. But I think just pocketing it is possession (which doesn’t seem to be wrong), and not necessarily copying.
But like: how do other laws work where you live? Can you instruct someone to do something illegal and you’re fine? I can’t come up with anything normal, let’s say I hire someone to kidnap my child/wife to teach them a lesson. Or I hire a hitman to kill my arch enemy. Am I fine doing that? It’s a bit over the top. But where I live I can certainly get into trouble if I make people do something on my behalf. Which I’d argue doesn’t exactly happen here. It’s a bit more complicated… But your concept of law doesn’t seem to make much sense to me.
Well, did the uploader push it onto my computer, or was it me who clicked on something and initiated the transfer? I’d say it’s the latter. So the downloader initiated the copying process… I mean if I steal an orange in the supermarket, we also don’t say it fell into my hands and somehow they’re guilty…
And additionally you might find other local laws like this: https://www.nysenate.gov/legislation/laws/PEN/156.35
and generally, this is related to wire fraud, computer fraud, unlawfully obtaining information from a protected computer… Which all seem to be crimes and something people get charged with if someone wants them convicted of a crime. See Aaron Swartz for example.
In Germany we definitely have this:
and we have Störerhaftung which can also get you in trouble even if you didn’t do it yourself.
In the USA it’d probably be something like https://www.law.cornell.edu/uscode/text/17/106
which says the copyright holder has exclusive rights (1) to reproduce the copyrighted work in copies […]
Which is kind of what you do when downloading something. There will be a copy on your harddisk… And (1) does NOT limit this to redistribution, like (3)… I’m not sure how it turns out in practice. I don’t follow American court rulings that closely.
These restrictions are meant to limit what other people can do. So you yourself can do anything with your content, no matter what’s in the license. It just means other people can’t use it in their projects if it relates to making money. But I think something like CC BY-SA or BY-NC-SA is a solid choice. I’m always for freedom, so I’d drop the NC and allow my audience to do practically whatever they want… But it’s your creation and your choice.
https://creativecommons.org/share-your-work/
Read that for a start. And maybe consider sharing your work under a nice license. You can also check out platforms like Jamendo, Bandcamp, the Internet Archive… And as far as I understand archive.org will even handle generating some torrents for you.
Yeah, I thought about including a NordVPN advertisement in my comment… Do people do that? I mean sure it’s an additional privacy technique. But I regularly just talk about whatever legal things I like and I don’t take this too seriously. It’s my regular internet connection. I make a clean cut after that, though. If I now were to download something, I’d think about not doing it in plain sight. And I’d use a different username to sign up somewhere if needed.
Though, I forgot to talk about different jurisdictions. And for example where I live, I can’t give someone instructions on how to commit a crime. Or help doing illegal things. So that’s another thing I avoid.
And I’m kind of talking about: talking about piracy. Not doing it. If you share links in an underground forum, or upload some movies, you’d need way more security measures in place.
I’d say the rules of such a community usually factor that in. Don’t directly link to pirated content etc. So the first thing is to read the rules.
It’s also always a good idea to watch other people. See what they do and how they talk before you ask where to find the latest pirated Nintendo collection, mere minutes after creating an account.
General media literacy applies. Choose a pseudonym and not your real name. Don’t attach your main email address and phone number to your piracy accounts. And maybe generally don’t mix all your online life together.
And some mild trick is to phrase things so you don’t outright admit to committing illegal activities. Say you think a friend of yours did this, or what would happen if someone were to do that… But this doesn’t really change anything. People can tell. And if law enforcement takes interest in you, they don’t really care how you phrased things after they found evidence.
But talking about piracy isn’t illegal. And avoiding names and links sometimes is more about protecting the server you’re on, since the place could get closed if it’s where the actual pirating happens. So in everyone’s interest, I think we talk about piracy here, but don’t do it here…
And there is a Megathread, a FAQ etc in the sidebar. With all sorts of good information. And lots of links. So if you read that, you can skip asking some questions in the first place.
Uh, thanks. That really doesn’t look good. Usually copyright infringement is a civil matter. And I believe we had sufficient laws to handle that in European countries. I haven’t read the cited new law, but I guess that “shortcut” just does away with everyone’s privacy. Plus it’s going to swamp the courts with cases. I’m not sure if they’re bored or anything… But either they just hand out fines without checking properly… Or, if done properly, this is just a lot of additional work for the justice system. To the benefit of the copyright industry. And either way, it’s just bad for the people.
Edit: I believe this is the mentioned government gazette. The copyright changes are in Chapter 2: https://www.e-nomothesia.gr/kat-arxaiotites/n-5179-2025.html