[–] Hamartiogonic@sopuli.xyz 141 points 1 year ago (2 children)

Text written before 2023 is going to be exceptionally valuable because that way we can be reasonably sure it wasn’t contaminated by an LLM.

This reminds me of some research institutions pulling up sunken ships so that they can harvest the steel and use it to build sensitive instruments. You see, before the nuclear tests there was hardly any radiation anywhere. However, after America and the Soviet Union started nuking stuff like there’s no tomorrow, pretty much all steel on Earth has been a little bit contaminated. Not a big issue for normal people, but scientists building super sensitive equipment certainly notice the difference between pre-nuclear and post-nuclear steel.

[–] Eheran@lemmy.world 46 points 1 year ago (1 children)

The background radiation did go up, but saying "there was hardly any radiation anywhere" is wrong. Today's steel (and background radiation) is pretty much back to pre-nuke levels. See: Low-background steel, Background radiation.

[–] evatronic@lemm.ee 26 points 1 year ago (2 children)

It is also worth noting that we can make steel with little or no radioactive contamination; it's just really expensive and hard, and it happens in very low quantities.

[–] lily33@lemmy.world 13 points 1 year ago (1 children)

Not really. If it's truly impossible to tell the text apart, then it doesn't really pose a problem for training AI. Otherwise, next-gen AI will be able to tell apart text generated by current-gen AI, and it will get filtered out. So only the most recent data will have unfiltered shitty AI-generated stuff, but they don't train AI on super-recent text anyway.

[–] Womble@lemmy.world 30 points 1 year ago (4 children)

This is not the case. Model collapse is a studied phenomenon for LLMs and leads to deteriorating quality when models are trained on the data that comes from themselves. It might not be an issue if there were thousands of models out there but there are only 3-5 base models that all the others are derivatives of IIRC.
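
To make the feedback loop concrete, here is a deliberately tiny sketch. It is not how any real LLM is trained: it assumes a four-word vocabulary and assumes that each generation of model slightly over-represents its most likely words when it samples (modeled by the hypothetical `sharpen` exponent). Under those assumptions, training each generation on the previous one's output steadily drains diversity, which the code measures as Shannon entropy.

```python
import math


def entropy(p):
    """Shannon entropy (in bits) of a discrete probability distribution."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def retrain_on_own_output(p, sharpen=1.5):
    """One toy 'generation' of training on the previous model's output.

    Assumption for illustration only: sampling favors likely words, which is
    modeled by raising each probability to a power > 1 and renormalizing.
    """
    powered = [x ** sharpen for x in p]
    total = sum(powered)
    return [x / total for x in powered]


def simulate(p, generations=5):
    """Return the entropy of the word distribution after each generation."""
    history = [entropy(p)]
    for _ in range(generations):
        p = retrain_on_own_output(p)
        history.append(entropy(p))
    return history


# Made-up word frequencies standing in for "human" text.
human = [0.4, 0.3, 0.2, 0.1]
for gen, h in enumerate(simulate(human)):
    print(f"generation {gen}: entropy = {h:.3f} bits")
```

The rare words are the first to vanish, and nothing in the loop ever brings them back. Real model collapse involves sampling noise and far richer models, so treat this only as an intuition pump.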

[–] lily33@lemmy.world 8 points 1 year ago* (last edited 1 year ago) (3 children)

I don't see how that affects my point.

Today's AI detector can't tell apart the output of today's LLM.
Future AI detector WILL be able to tell apart the output of today's LLM.
Of course, future AI detector won't be able to tell apart the output of future LLM.

So at any point in time, only recent text could be "contaminated". The claim that "all text after 2023 is forever contaminated" just isn't true. Researchers would simply have to be a bit more careful including it.

[–] Womble@lemmy.world 13 points 1 year ago (3 children)

Your assertion that a future AI detector will be able to detect current LLM output is dubious. If I give you the sentence "Yesterday I went to the shop and bought some milk and eggs.", there is no way for you or any detection system to tell whether that was AI generated with any significant degree of certainty. What can be done is statistical analysis of large data sets to see how they "smell", but saying that around 30% of a dataset is likely LLM generated does not get you very far in creating a training set.

I'm not saying that there is no solution to this problem, but blithely waving away the problem saying future AI will be able to spot old AI is not a serious take.
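
The gap between "about 30% of this dataset is LLM generated" and "here is a clean training set" can be put in numbers. The sketch below assumes a hypothetical detector whose hit rate on AI text (`tpr`) and false-alarm rate on human text (`fpr`) are already known; the figures are invented for illustration. The correction used for the dataset-level estimate is the standard Rogan-Gladen formula.

```python
def estimate_ai_fraction(flagged_rate, tpr, fpr):
    """Correct the raw flag rate for detector error (Rogan-Gladen estimator)."""
    if tpr <= fpr:
        raise ValueError("detector must flag AI text more often than human text")
    estimate = (flagged_rate - fpr) / (tpr - fpr)
    return min(1.0, max(0.0, estimate))  # clamp to a valid proportion


def precision_of_flags(ai_fraction, tpr, fpr):
    """Share of flagged documents that really are AI generated."""
    true_pos = tpr * ai_fraction
    false_pos = fpr * (1 - ai_fraction)
    return true_pos / (true_pos + false_pos)


def ai_left_after_filtering(ai_fraction, tpr, fpr):
    """Share of AI text remaining once every flagged document is dropped."""
    missed_ai = (1 - tpr) * ai_fraction
    kept_human = (1 - fpr) * (1 - ai_fraction)
    return missed_ai / (missed_ai + kept_human)


# Hypothetical detector: catches 80% of AI text, wrongly flags 10% of human text.
tpr, fpr = 0.8, 0.1
share = estimate_ai_fraction(flagged_rate=0.31, tpr=tpr, fpr=fpr)
print(f"estimated AI share of dataset: {share:.1%}")
print(f"flags that are correct:        {precision_of_flags(share, tpr, fpr):.1%}")
print(f"AI share after filtering:      {ai_left_after_filtering(share, tpr, fpr):.1%}")
```

Even with a detector this good, the aggregate estimate says nothing about which individual documents are synthetic: a chunk of the flagged text is human, and some AI text stays in the filtered set.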

[–] Peanutbjelly@sopuli.xyz 79 points 1 year ago (3 children)

The wording of every single article has such an anti AI slant, and I feel the propaganda really working this past half year. Still nobody cares about advertising companies, but LLMs are the devil.

Existing datasets still exist. The bigger focus is in crossing modalities and refining content.

Why is the negative focus always on the tech and not the political system that actually makes it a possible negative for people?

I swear, most of the people with heavy opinions don't even know half of how the machines work or what they are doing.

[–] _jonatan_@lemmy.world 81 points 1 year ago (4 children)

Probably because LLMs threaten to (and have already started to) shittify a truly incredible number of things like journalism, customer service, books, scriptwriting etc., all in the name of increased profits for a tiny few.

[–] Peanutbjelly@sopuli.xyz 57 points 1 year ago (6 children)

again, the issue isn't the technology, but the system that forces every technological development into functioning "in the name of increased profits for a tiny few."

that has been an issue for the fifty years prior to LLMs, and will continue to be the main issue after.

removing LLMs or other AI will not fix the issue. why is it constantly framed as if it would?

we should be demanding the system adjust for the productivity increases we've already seen, as well as for those we expect in the near future. the system should make every advancement a boon for the general populace, not the obscenely wealthy few.

even the fears of propaganda. the wealthy can already afford to manipulate public discourse beyond the general public's ability to keep up. the bigger issue is in plain sight, but is still being largely ignored for the slant that "AI is the problem."

[–] p03locke@lemmy.dbzer0.com 22 points 1 year ago

Yep, the problem was never LLMs, but billionaires and the rich. The problems have always been the rich for thousands of years, and yet they are immensely successful at deflecting their attacks to other groups for those thousands of years. They will claim it's Chinese immigrants, or blacks, or Mexicans, or gays, or trans people. Now LLMs and AI are the new boogieman.

We should be talking about UBI, not LLMs.

[–] Gutless2615@ttrpg.network 21 points 1 year ago (1 children)

It’s a capitalism problem not an AI or copyright problem.

[–] agitatedpotato@lemmy.world 7 points 1 year ago* (last edited 1 year ago) (2 children)

Sure, but let's say you try to solve this problem. What's the first thing you think a coordinated group could do: get sensible regulations about AI, or overthrow global capitalism? It's framed the way it is because unless you want to revolt, that's the framework we're gonna have to use to deal with it. I suppose we could always do nothing to AI specifically and focus on just overthrowing capitalism, but during that time lots of harm will come to lots of workers because of AI use. I don't think anticapitalism has reached a critical mass (we need this for any real system-wide attacks on and alternatives to capitalism), so I think dealing with this AI problem and trying to let everyone else know about how it's really a capitalism thing would do more to build support and avert harm to workers. I hate that it's like that too, but those choices are basically the real options we have moving forward from my pov.

[–] Gutless2615@ttrpg.network 12 points 1 year ago* (last edited 1 year ago) (1 children)

You tell me what "sensible regulations about AI" are that don't hurt small artists and creators more than they centralize the major players and enrich copyright hoarding, copyright-maximalist corporations. (Seriously, this isn’t bait. I’ve been wracking my mind on the issue for months. Because the only serious proposals so far are expanding the already far-too-broad copyright rights to things like covering training or granting artists more rights to their work during their lifetime - something that will only hurt small artists) We desperately need more fair use, not less. The only "sensible regulations" that we should and could be talking about is some form of UBI. That's it.

[–] agitatedpotato@lemmy.world 5 points 1 year ago* (last edited 1 year ago) (1 children)

UBI is a bandaid that doesn't solve the core issues of production under capitalism: the people with capital still control production, still make more money than everyone else, and still have more money and power to use influencing the politicians that write the laws surrounding UBI. And expecting me to solve the AI problem in a comment section is like me asking you to implement UBI in a way that landlords don't just jack up rent or businesses don't inflate prices with more cash and demand floating around. Also, what's your plan for when the level of UBI legislated, or the planned increases in UBI, are no longer sufficient to pay for housing, food, and other necessities? What do you do to counter the fact that the capitalists still have more access to politicians and media empires they can use to discredit and remove UBI?

[–] Gutless2615@ttrpg.network 7 points 1 year ago* (last edited 1 year ago) (10 children)

UBI is a bandaid, sure. But bandaids actually help; “sensible AI regulations” - a nothing phrase that will most likely materialize as yet another expansion of copyright — will actively make things worse. UBI is achievable, and can be expanded on once it’s enacted. You establish protections and regulations that actually help people, and dare opposition to ever try to take them away; instead of carrying water for copyright maximalists along the way.

[–] p03locke@lemmy.dbzer0.com 6 points 1 year ago

"a nothing phrase that will most likely materialize as yet another expansion of copyright"

Exactly. We need to break apart copyright with a crowbar. It's a broken system that only benefits the rich, and AI has the opportunity to turn the entire system into a pile of unenforceable garbage.

[–] p03locke@lemmy.dbzer0.com 6 points 1 year ago (2 children)

"get sensible regulations about AI"

There's no such thing as "sensible regulations" for AI. AI is a technological advantage. Any time you regulate that advantage, other groups that don't have those regulations will fuck you over. Even if you start talking about regulations, the corpos will take over and fuck you over with regulations that only hurt the little guy.

Hell, even without regulations, we're already seeing this on the open-source vs. capitalism front. Google admitted that it lost some advantages because of open-source AI tools, and now these fucking cunts are trying to hold on to their technology as close as possible. This is technology that needs to be free and open-source, and we're going to see a fierce battle with multi-billion-dollar capitalistic corporations clawing back whatever technological gains OSS acquired, until you're forced to spend hundreds or thousands of dollars to use a goddamn chess bot.

GPLv3 is key here, and we need to force these fuckers into permanent copyleft licenses that they can't revoke. OpenAI is not open, StabilityAI is not the future, and Google is not your friend.

[–] jackoneill@lemmy.world 6 points 1 year ago

This isn’t a technological issue, it’s a human one

I totally agree with everything you said, and I know that it will never ever happen. Power is used to get more power. Those in power will never give it up, only seek more. They intentionally frame the narrative to make the more ignorant among us believe that the tech is the issue rather than the people that own the tech.

The only way out of this loop is for the working class to rise up and murder these cunts en masse

Viva la revolucion!

[–] Fried_out_Kombi@lemmy.world 5 points 1 year ago* (last edited 1 year ago) (1 children)

Exactly. I work in AI (although not the LLM kind, just applying smaller computer vision models), and my belief is that AI can be a great liberator for humanity if we have the right political and economic apparatus. The question is what that apparatus is. Some will say it's an inherent feature of capitalism, but that's not terribly specific, nor does it explain the relatively high wealth equality that existed briefly during the middle of the 20th century in America. I think some historical context is important here.

Historical Precedent

During the Industrial Revolution, we had an unprecedented growth in average labor productivity due to automation. From a naïve perspective, we might expect increasing labor productivity to result in improved quality of life and fewer working hours. I.e., the spoils of that productivity being felt by all.

But what we saw instead was the workers lived in squalor and abject poverty, while the mega-rich captured those productivity gains and became stupidly wealthy.

Many people at the time took note of this and sought to answer this question: why, in an era of greater-than-ever labor productivity, is there still so much poverty? Clearly all that extra wealth is going somewhere, and if it's not going to the working class, then it's evidently going to the top.

One economist and philosopher, Henry George, wrote a book exploring this very question, Progress and Poverty. His answer, in short, was rent-seeking:

Rent-seeking is the act of growing one's existing wealth by manipulating the social or political environment without creating new wealth.[1] Rent-seeking activities have negative effects on the rest of society. They result in reduced economic efficiency through misallocation of resources, reduced wealth creation, lost government revenue, heightened income inequality,[2] risk of growing political bribery, and potential national decline.

Rent-seeking takes many forms. To list a few examples:

Land speculation
Monopolization of finite natural resources (e.g., oil, minerals)
Offloading negative externalities (e.g., pollution)
Monopolization of intellectual property
Regulatory capture
Monopolistic or oligopolistic control of entire markets

George's argument, essentially, was that the privatization of the economic rents borne of god-given things — be it land, minerals, or ideas — allowed the rich and powerful to extract all that new wealth and funnel it into their own portfolios. George was not the only one to blame these factors as the primary drivers of sky-high inequality; Nobel-prize winning economist Joseph Stiglitz has stated:

Specifically, I suggest that much of the increase in inequality is associated with the growth in rents — including land and exploitation rents (e.g., arising from monopoly power and political influence).

George's proposed remedies were a series of taxes and reforms to return the economic rents of those god-given things to society at large. These include:

Implementation of land value taxes:

Land value taxes are generally favored by economists as they do not cause economic inefficiency, and reduce inequality.[2] A land value tax is a progressive tax, in that the tax burden falls on land owners, because land ownership is correlated with wealth and income.[3][4] The land value tax has been referred to as "the perfect tax" and the economic efficiency of a land value tax has been accepted since the eighteenth century.

Implementation of Pigouvian (aka externality) taxes, e.g., carbon tax:

A Pigouvian tax (also spelled Pigovian tax) is a tax on any market activity that generates negative externalities (i.e., external costs incurred by the producer that are not included in the market price). The tax is normally set by the government to correct an undesirable or inefficient market outcome (a market failure) and does so by being set equal to the external marginal cost of the negative externalities. In the presence of negative externalities, social cost includes private cost and external cost caused by negative externalities. This means the social cost of a market activity is not covered by the private cost of the activity. In such a case, the market outcome is not efficient and may lead to over-consumption of the product.[1] Often-cited examples of negative externalities are environmental pollution and increased public healthcare costs associated with tobacco and sugary drink consumption.[2]

Implementation of severance taxes,

Severance taxes are taxes imposed on the removal of natural resources within a taxing jurisdiction. Severance taxes are most commonly imposed in oil producing states within the United States. Resources that typically incur severance taxes when extracted include oil, natural gas, coal, uranium, and timber. Some jurisdictions use other terms like gross production tax.

such as in the Norwegian model:

The key to Norway’s success in oil exploitation has been the special regime of ownership rights which apply to extraction: the severance tax takes most of those rents, meaning that the people of Norway are the primary beneficiaries of the country’s petroleum wealth. Instead of privatizing the resource rents provided by access to oil, companies make their returns off of the extraction and transportation of the oil, incentivizing them to develop the most efficient technologies and processes rather than simply collecting the resource rents. Exploration and development is subsidized by the Norwegian government in order to maximize the amount of resource rents that can be taxed by the state, while also promoting a highly competitive environment free of the corruption and stagnation that afflicts state-controlled oil companies.

Intellectual property reform, e.g., abolishing patents and instead subsidizing open R&D, similar to a Pigouvian anti-tax (research has positive externalities) or Norway's subsidization of oil exploration
Implementation of a citizen's dividend or universal basic income, e.g., the Alaska permanent fund or carbon tax-and-dividend:

Citizen's dividend is a proposed policy based upon the Georgist principle that the natural world is the common property of all people. It is proposed that all citizens receive regular payments (dividends) from revenue raised by leasing or taxing the monopoly of valuable land and other natural resources.

...

This concept is a form of universal basic income (UBI), where the citizen's dividend depends upon the value of natural resources or what could be titled as common goods like location values, seignorage, the electromagnetic spectrum, the industrial use of air (CO2 production), etc.[4]

Funding public goods via the Henry George Theorem:

In 1977, Joseph Stiglitz showed that under certain conditions, beneficial investments in public goods will increase aggregate land rents by at least as much as the investments' cost.[1] This proposition was dubbed the "Henry George theorem", as it characterizes a situation where Henry George's 'single tax' on land values, is not only efficient, it is also the only tax necessary to finance public expenditures.[2] Henry George had famously advocated for the replacement of all other taxes with a land value tax, arguing that as the location value of land was improved by public works, its economic rent was the most logical source of public revenue.[3]

...

Subsequent studies generalized the principle and found that the theorem holds even after relaxing assumptions.[4] Studies indicate that even existing land prices, which are depressed due to the existing burden of taxation on labor and investment, are great enough to replace taxes at all levels of government.[5][6][7]

(continued)

[–] Fried_out_Kombi@lemmy.world 6 points 1 year ago* (last edited 1 year ago) (4 children)

Present Day

Okay, so that's enough about the past. What about now?

Well, monopolization of land and housing via the housing crisis has done tremendous harm:

In 2015, two talented professors, Enrico Moretti at Berkeley and Chang-Tai Hsieh at Chicago Booth, decided to estimate the effect of shortage of housing on US productivity. They concluded that lack of housing had impaired US GDP by between 9.5 per cent and 13.5 per cent.

In a follow-up paper, based on surveying 220 metropolitan areas, they revised the figure upwards – claiming that housing constraints lowered aggregate US growth by more than 50 per cent between 1964 and 2009. In other words, they estimate that the US economy would have been 74 per cent larger in 2009, if enough housing had been built in the right places.

How does that damage happen? It’s simple. The parts of the country with the highest productivity, like New York and San Francisco, also had stringent restrictions on building more homes. That limited the number of homes and workers who could move to the best job opportunities; it limited their output and the growth of the companies who would have employed them. Plus, the same restrictions meant that it was more expensive to run an office or open a factory, because the land and buildings cost more.

And that is just one form of rent-seeking. Imagine the collective toll of externalities (e.g., the climate crisis), monopolistic/oligopolistic markets such as energy and communications, monopolization of valuable intellectual property, etc.

So I would tend to say that — unless we change our policies to eliminate the housing crisis, properly price in externalities, eliminate monopolies, encourage the growth of free and open IP (e.g., free and open-source software, open research, etc.), and provide critical public goods/services such as healthcare and education and public transit — we are on a trajectory for AI to be Gilded Age 2: Electric Boogaloo. AI merely represents yet another source of productivity growth, and its economic spoils will continue to be captured by the already-wealthy.

I say this as someone who works as an AI and machine learning research engineer: AI alone will not fix our problems; it must be paired with major policy reform so that the economic spoils of progress are felt by all, not just the rich.

Joseph Stiglitz, in the same essay I referred to earlier, has this to say:

My analysis of market models suggests that there is no inherent reason that there should be the high level of inequality that is observed in the United States and many other advanced countries. It is not a necessary feature of the market economy. It is politics in the 21st century, not capitalism, which is at fault. Market and political forces have, of course, always been intertwined. Especially in America, where our politics is so money-driven, economic inequalities translate into political inequality.

There is nevertheless considerable hope. For if the growth of inequality was largely the result of inexorable economic laws, public policy could do little more than lean against the wind. But if the growth of inequality is the result of public policy, a change in those policies could lead to an economy with less inequality, and even stronger growth.

[–] Ostrichgrif@lemmygrad.ml 5 points 1 year ago

I completely agree with you. AI should be seen as a great thing, but we all know that the society we live in will not pass those benefits to the average person; in fact, it'll probably be used to make life worse. From a leftist perspective it's very easy to see this, but from the normie position, at least in the US, people aren't thinking about how our society slants AI towards being evil and scary; they just think AI is evil and scary. Again, I completely agree with what you've said; it's just important to remember how reactionary the average person is.

[–] glockenspiel@lemmy.world 4 points 1 year ago* (last edited 1 year ago)

It is a completely understandable stance in the face of the economic model, though. Your argument could be fitted to explain why firearms shouldn’t be regulated at all. It isn’t the technology, so we should allow the sale of actual machine guns (outside of weird loopholes) and grenade launchers.

The reality is that the technology is targeted by the people affected by it because we are hopeless in changing the broader system which exists to serve a handful of parasitic non-working vampires at the top of our societies.

Edit: not to suggest that I’m against AI and LLM. I want my fully automated luxury communism and I want it now. However, I get why people are turning against this stuff. They’ve been fucked six ways from Sunday and they know how this is going to end for them.

Plus, a huge amount of AI doomerism is being pushed by the entrenched monied AI players, like OpenAI and Meta, in order to use a captured government to regulate potential competition out of existence.

[–] mimichuu_@lemm.ee 6 points 1 year ago (6 children)

I am so tired of techno-fetishist AI bros complaining every single time any of the many ways in which AI will devastate and rot our daily lives is brought up.

"It's not the tech! It's the economic system!"

As if they're different things? Who is building the tech? Who is pouring billions into the tech? Who is protecting the tech from proper regulation, smartass? I don't see any worker coops using AI.

"You don't even know how it works!"

Just a thought-terminating cliche to try to avoid any discussion or criticism of your precious little word generators. No one needs to know how a thing works to know its effects. The effects are observable reality.

Also, nobody cares about advertising companies? What the hell are you on about?

[–] art@lemmy.world 30 points 1 year ago (1 children)

We built a machine to mimic human writing. There's going to be a point where there is no difference. We might already be there.

[–] MyUnclesSecret@lemmy.world 13 points 1 year ago (1 children)

The machine used to mimic human text uses human text. If it can't tell the difference between its own text and human text, it will begin using AI text to mimic human text. This will eventually lead to errors, repetitions, and/or less human-like text.

[–] RandomlyAssigned@lemmy.world 24 points 1 year ago

On the one hand, our AI is designed to mimic human text; on the other hand, we want to detect AI-generated text that was designed to mimic human text. These two goals don't align at a fundamental level.

[–] BackupRainDancer@lemmy.world 24 points 1 year ago* (last edited 1 year ago) (7 children)

Predictable issue if you knew the fundamental technology that goes into these models. Hell it should have been obvious it was headed this way to the layperson once they saw the videos and heard the audio.

We're less sensitive to patterns in massive data than these machines are, so the point at which we can't tell fact from AI fiction based on the content comes before the point at which the machines can't tell. Good luck with the FB aunts.

A GAN's final goal is to develop content that is indistinguishable from the real thing... Are we surprised?

Edit, since the person below me made a great point: GANs may be limited, but there's nothing that says you can't set up a generator LLM and a detector LLM, with the distinct intent to make detectors and generators for the sole purpose of improving the generator.

[–] throwsbooks@lemmy.ca 22 points 1 year ago (2 children)

For laymen who might not know how GANs work:

Two AIs are developed at the same time: one that generates and one that discriminates. The generator creates a dataset, it gets mixed in with some real data, then all of that gets fed into the discriminator, whose job is to say "fake or not".

Both AI get better at what they do over time. This arms race creates more convincing generated data over time. You know your generator has reached peak performance when its twin discriminator has a 50/50 success rate. It's just guessing at that point.

There literally cannot be a better AI than the twin discriminator at detecting that generator's work. So anyone trying to make tools to detect ChatGPT's writing is going to have a very hard time of it.
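
The "50/50" endpoint can be checked with a small calculation. This is not a GAN training loop, just a sketch of where that loop is heading, under toy assumptions: real data is a single number drawn from a normal distribution centered at 0, the generator's output is the same bell curve centered at `generator_mean`, and the discriminator sees an even mix of both. In that setup the best possible discriminator simply thresholds at the midpoint, so its accuracy can be computed exactly.

```python
import math


def normal_cdf(x):
    """Cumulative distribution function of the standard normal."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def best_discriminator_accuracy(generator_mean):
    """Accuracy of the optimal 'fake or not' rule in the toy setup.

    Assumptions (illustrative only): real ~ N(0, 1), fake ~ N(generator_mean, 1),
    mixed 50/50. The optimal rule cuts halfway between the two means.
    """
    return normal_cdf(abs(generator_mean) / 2.0)


# As the generator's output drifts toward the real data, the best possible
# detector slides toward a coin flip.
for mean in [4.0, 2.0, 1.0, 0.5, 0.0]:
    acc = best_discriminator_accuracy(mean)
    print(f"generator off by {mean:>3}: best detector accuracy = {acc:.1%}")
```

Once the two distributions coincide there is no signal left to exploit, which is the situation described above: the discriminator is just guessing.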

[–] BackupRainDancer@lemmy.world 6 points 1 year ago

Fantastically put!

[–] thebestaquaman@lemmy.world 14 points 1 year ago

This just illustrates the major limitation of ML: access to reliable training data. A machine that has no concept of internal reasoning can never be truly trusted to solve novel problems, and novel problems, from minor issues to very complex ones, are solved in a bunch of professions every day. That's what drives our world forward. If we rely too heavily on AI to solve problems for us, the issue of obtaining reliable training data to train future AIs will only expand. That's why I currently don't think AIs will replace large swaths of the workforce, but will to a larger degree be used as a tool by the humans in the workforce.

[–] ChrislyBear@lemmy.world 13 points 1 year ago (3 children)

So every accusation of cheating/plagiarism etc. and the resulting bad grades need to be revised because the AI checker incorrectly labelled submissions as "created by AI"? OK.

[–] Peanutbjelly@sopuli.xyz 8 points 1 year ago* (last edited 1 year ago)

i laughed pretty hard when south park did their chatgpt episode. they captured the school response accurately with the shaman doing whatever he wanted, in order to find content "created by AI."

[–] average650@lemmy.world 11 points 1 year ago (1 children)

I mean, the entire goal of the technology was to create human-like text.

[–] Techmaster@lemmy.world 9 points 1 year ago

Relax, everybody. I have figured out the solution. We pass a law that all AI generated text has to be in Pig Latin or Ubbi Dubbi.

[–] professor_entropy@lemmy.world 7 points 1 year ago* (last edited 1 year ago)

FWIW It's not clear cut if AI generated data feeding back into further training reduces accuracy, or is generally harmful.

Multiple papers have shown that images generated by high-quality diffusion models, with a proportion of real images in the mix (30-50%), improve the adversarial robustness of the models. Similar things might apply to language modeling.

[–] kvothelu@lemmy.world 7 points 1 year ago (2 children)

i wonder why Google is still not considering buying reddit and other forums where personal discussion takes place and where the user base sorts quality content free of charge. it has been established already that Google queries are way more useful when coupled with reddit

[–] Lenins2ndCat@lemmy.world 16 points 1 year ago (1 children)

Making google better is not google's goal. Growth is their goal.

[–] angstylittlecatboy@reddthat.com 7 points 1 year ago

I'm honestly under the impression Google Search is one of their less valuable products, even if it's the one everyone associates the company's name with.

[–] howrar@lemmy.ca 6 points 1 year ago (1 children)

Why buy it when you can get the same data for free?

[–] MercuryUprising@lemmy.world 4 points 1 year ago

Why buy data for accuracy when you don't care and support your company with SEO spam?

[–] Matriks404@lemmy.world 6 points 1 year ago

I wonder if AI generated texts (or speech) will impact our language. Kinda interesting thing to think about.

[–] Toneswirly@lemmy.world 5 points 1 year ago (1 children)

OpenAI also financially benefits from keeping the hype train rolling. Talking about how disruptive their own tech is gets them attention and investments. Just take it with a grain of salt.

[–] diffuselight@lemmy.world 6 points 1 year ago (8 children)

It's not possible to tell AI generated text from human writing at any level of real world accuracy. Just accept that.

[–] cerevant@lemmy.world 4 points 1 year ago (1 children)

If it could, it couldn’t claim that the content it produced was original. If AI generated content were detectable, that would be a tacit admission that it is entirely plagiarized.

[–] howrar@lemmy.ca 7 points 1 year ago (3 children)

Being detectable does not mean plagiarism. The way they did it was by using a fixed rule for generating high-entropy words. These are words that can be replaced with a large number of different words without changing the meaning of the sentence. Given any original passage of text, it's very unlikely for those words to all exactly follow the rule set by the generator, but generated text will always follow this rule, so the two can be distinguished. Likewise, you can take any original passage and replace words in this fashion to increase the odds of it being detected as AI generated, and the resulting text will still be original text.
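
A rough sketch of that idea, with heavy simplifications that are mine rather than the actual scheme's: the "fixed rule" here is a hash that marks roughly half of all words as "green", the replaceable words come from a hand-written `synonyms` table instead of a language model, and the names `is_green`, `watermark`, and `green_z_score` are hypothetical. Real watermarking schemes work on model token probabilities and typically key the rule on the preceding tokens.

```python
import hashlib
import math


def is_green(word):
    """Fixed pseudo-random rule: roughly half of all words count as 'green'."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return digest[0] % 2 == 0


def watermark(words, synonyms):
    """Swap each replaceable word for a green synonym when one is available."""
    out = []
    for word in words:
        green_options = [w for w in synonyms.get(word, []) if is_green(w)]
        out.append(green_options[0] if green_options else word)
    return out


def green_z_score(words):
    """Standard deviations by which the green count exceeds the 50% chance level."""
    n = len(words)
    hits = sum(is_green(w) for w in words)
    return (hits - 0.5 * n) / math.sqrt(0.25 * n)


# Illustrative input; the synonym table stands in for the model's
# high-entropy choices.
original = "the quick dog ran to the big house".split()
synonyms = {
    "quick": ["fast", "swift", "rapid", "speedy"],
    "ran": ["sprinted", "dashed", "raced", "hurried"],
    "big": ["large", "huge", "giant", "vast"],
    "house": ["home", "dwelling", "residence", "cottage"],
}
marked = watermark(original, synonyms)
print(" ".join(marked))
print("z-score before:", round(green_z_score(original), 2))
print("z-score after: ", round(green_z_score(marked), 2))
```

On a passage this short the score is not meaningful; over thousands of words, ordinary writing should hover near zero while rule-following text climbs far above it. The detector needs only the rule, not a copy of any source text, which is why detectability says nothing about plagiarism.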
