this post was submitted on 27 May 2024
81 points (94.5% liked)

Privacy

32107 readers
569 users here now

A place to discuss privacy and freedom in the digital world.

Privacy has become a very important issue in modern society. With companies and governments constantly abusing their power, more and more people are waking up to the importance of digital privacy.

In this community everyone is welcome to post links and discuss topics related to privacy.

much thanks to @gary_host_laptop for the logo design :)

founded 5 years ago

Is it possible to blog in the AI era?

I write short stories every now and then and I throw them online. I also have a tech blog, where I moan about the decisions made by the software I use and, with my "infinite wisdom", tell the developers what they should be doing instead.

I used to host both on Medium, but Medium got greedy. Then it was WordPress, but now even they're trying to be greedy bastards and use my shit for training AI.

Some would argue that WordPress paid hosting will exempt me from the AI training, but for less than 100 visitors a year, it's not really worth the expense.

So what is the solution? I ask the greater minds of this community for suggestions.

[–] [email protected] 33 points 6 months ago (1 children)

Host your own stuff. With this little load you can do it on your own hardware with very little resources.

[–] [email protected] 2 points 6 months ago (2 children)

Yeah, I was thinking about throwing something on my Raspberry Pi, but didn't know if I'd open the door to more issues.

[–] [email protected] 4 points 6 months ago (5 children)

It can be pretty secure if you host it behind a Cloudflare tunnel. Then you don't have to open any ports to the wild west.

[–] [email protected] 3 points 5 months ago

You could also spin up a $5 a month VPS somewhere like Linode.

[–] [email protected] 24 points 6 months ago (2 children)

I think you should clarify the problem first.

Privacy? You lose your privacy the moment you publish your blog anyway.

Is it visibility? You never expected Google to show your blog in most cases.

AI training? You could self-host and hope companies respect your robots.txt. But what's the actual problem if you released your blog to the public in the first place? Anybody could've copied and pasted your blog before this AI era, too.

[–] [email protected] 3 points 6 months ago (1 children)

Privacy?

Privacy is of course my major concern, hence posting to this community. But not tinfoil hat level.

visibility?

I'm happy to have my stuff indexed by Google, in fact, I want it to be.

AI training?

I'll take that for 500!

Anybody could've copied and pasted your blog before this AI era, too.

Plagiarism has been an issue since before Confucius was copied by Baffledus. But the cream still rose to the top. However, in this AI era, everything is buried, as it's all just considered part of the source data.

robots.txt.

Stories keep popping up about AI ignoring robots.

[–] [email protected] 9 points 6 months ago (1 children)

But... have we ever had privacy with blog articles? I mean the public ones.

[–] [email protected] 5 points 6 months ago

I guess it comes down to what your definition of privacy is. I'm setting the bar low: I just don't want to be used to train a large language model.

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago) (1 children)

Privacy? You lose your privacy the moment you publish your blog anyway.

Oh, right, I'm gonna just reinstall Facebook on the phone because I've lost everything... Oh, and we have lost all of our privacy by commenting on the internet and stepping out of the house! All resistance is futile! We need to close this community before people waste more of their time!

This is not at all how it works. How would you lose privacy if you only publish what you want to publish? It's entirely your decision what to include in your blog post.

[–] [email protected] 1 points 5 months ago

You're right, but I was talking specifically about blogs.

[–] popcar2 22 points 6 months ago* (last edited 6 months ago) (1 children)

There are two good options: Host your own blog yourself, or join a blogging platform that isn't corporate. I personally use BearBlog but I've heard good things about Write.as as well. These two have free blogging options and don't sell your data. If you want to host it yourself (which is safer), check out Hugo.

Ultimately, bots scrape the entire internet and there's no guarantee they will honor robots.txt of a particular website (which tells bots what they are and aren't allowed to do). If it's on the internet, people can scrape your content and there isn't much you can do about it. That shouldn't stop you from writing or blogging, just don't post very personal data.
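
To make the robots.txt point concrete, here is a minimal Python sketch of the check a well-behaved crawler runs before fetching a page. The rules shown are an illustrative assumption, not anyone's actual file: GPTBot and CCBot are the user-agent tokens published by OpenAI and Common Crawl, example.com is a placeholder, and `may_fetch` is a hypothetical helper name. Nothing forces a crawler to run this check at all, which is exactly the limitation described above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt for a small blog: opt out of two AI-related
# crawlers while leaving the site open to everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def may_fetch(user_agent: str, url: str, robots_txt: str = ROBOTS_TXT) -> bool:
    """Return True if the given robots.txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file content as a list of lines.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)
```

Calling `may_fetch("Googlebot", "https://example.com/stories/1")` consults the catch-all rule, while `may_fetch("GPTBot", ...)` hits the explicit `Disallow`. A scraper that never asks gets the page either way.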

Also, feel free to join us on [email protected]!

[–] [email protected] 5 points 6 months ago* (last edited 6 months ago)

When I was looking into Ghost over the weekend, Hugo kept popping up.

Also subscribed.

[–] [email protected] 11 points 5 months ago (1 children)

So let's be clear - there is no way to prevent others from crawling your website if they really want to (AI or non-AI).

Sure, you can put up a robots.txt or reject certain user agents (if you self-host) to try and screen the most common crawlers. But as far as your hosting is concerned, an AI crawler is not too different from, e.g., Google's crawler, which takes pieces of content to show in search results. You can put up a captcha or equivalent to screen out non-humans, but this does not work that well and might also prevent search engines from finding your site (which I don't know if you want?).

I don't have a solution for the AI problem. As for the "greed" problem, I think most of us poor folks do one of the following:
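
For the "reject certain user agents" route, a rough sketch of what that could look like on a self-hosted server, using only Python's standard library. The blocked substrings, the `is_blocked` and `serve` names, and the placeholder page are all assumptions for illustration; a real setup would more likely do this in the web server or reverse proxy config. As noted, a crawler that spoofs its User-Agent walks straight past it.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from typing import Optional

# Illustrative substrings to reject; which crawlers to list is an assumption.
BLOCKED_AGENT_SUBSTRINGS = ("gptbot", "ccbot")


def is_blocked(user_agent: Optional[str]) -> bool:
    """Return True if the User-Agent header contains a blocked substring."""
    if not user_agent:
        return False  # missing header: let it through rather than guess
    ua = user_agent.lower()
    return any(token in ua for token in BLOCKED_AGENT_SUBSTRINGS)


class BlogHandler(BaseHTTPRequestHandler):
    def do_GET(self) -> None:
        # Refuse listed crawlers with 403, serve everyone else.
        if is_blocked(self.headers.get("User-Agent")):
            self.send_error(403, "Crawler not welcome")
            return
        body = b"<h1>My blog</h1>"  # placeholder content
        self.send_response(200)
        self.send_header("Content-Type", "text/html; charset=utf-8")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)


def serve(host: str = "127.0.0.1", port: int = 8000) -> None:
    """Start the demo server; blocks until interrupted."""
    HTTPServer((host, port), BlogHandler).serve_forever()
```

Running `serve()` and requesting the page with `curl -A GPTBot http://127.0.0.1:8000/` should return a 403, while a normal browser gets the page.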

  • github pages (if you don't like github then codeberg or one of the other software forges that host pages)
  • self-host your own http server if it's not too much of a hassle
  • (make backups, yes always backups)

Now for the AI problem, there are no good solutions, but there are funny ones:

  • write stories that seem plausible but hide hijinks in them - if there ever was a good reason for being creative it is "I hope AI crawls my story and the night time news reports that the army is now using trained squirrels as paratroopers"
  • double speak - if it works for fictional fascist states it works for AI too - replace all uses of word/expression with another, your readers might be slightly confused but such is life
  • turn off your web site at certain times of the day, just show a message showing that it only works outside of US work hours or something

I should point out that none of this will make you famous or raise your SEO rank in search results.

PS: can you share your site? Now I'm curious about the stories.

[–] [email protected] 2 points 5 months ago

Sent via private message

[–] [email protected] 10 points 6 months ago* (last edited 6 months ago) (3 children)

If it's about having your blog serve as AI training, it doesn't really matter where you host it; it's going to get scraped and included in the data.

[–] [email protected] 10 points 6 months ago (3 children)

Maybe Write Freely? You get 3 blogs for 6 dollars a month or you can host it yourself since it’s open source.

https://write.as/pricing

https://writefreely.org/

[–] [email protected] 8 points 5 months ago (3 children)

Use the gemini protocol. No need to worry about bots or AI

[–] [email protected] 4 points 5 months ago

There are gemini to http gateways so the content is probably already crawled anyway.

[–] [email protected] 3 points 5 months ago

That's actually really cool. Not feasible as I want visitors, but cool AF.

[–] [email protected] 5 points 6 months ago (1 children)

Why not post your blogs to a fediverse platform? Do they need to be on a separate hosted system? You'll probably get more people reading and engaging with your posts if you are just posting to a Mastodon instance rather than hosting on a separate web platform and hoping that people stumble across it.

[–] [email protected] 2 points 6 months ago

Funny you say that. That's why I was kinda hoping for FireFish to be the new Tumblr, but that sadly didn't pan out. But one of my requirements for self hosting is Fediverse integration.

[–] [email protected] 4 points 5 months ago (1 children)

In spite of you saying it's not for you, I think that finding cheap hosting for your blog is the easiest solution here. With some effort you can export your WordPress.com site to the free WordPress engine it is built on. Then, with plugins, you can add auto-reposts to other social media and whatever else you want. Low traffic means you can choose the option with the lowest price.

As a bonus you can also host your portfolio page, get personalized email addresses, a VPN server to wherever it's hosted, and basically an environment you can put anything into, even your own Lemmy instance.

On the latter - the population of federated platforms is very small but super loyal, and also lacking content. So I feel that even if you don't consider making or renting your own server, establishing your blog here can get you a lot of interactions. Probably, some admins won't mind if you make your own /c/ommunity for that as long as it's not abandoned.

[–] [email protected] 2 points 5 months ago (1 children)

Sorry it took me so long to come back to this. Because it was such a valuable reply, I kept it in my inbox so I wouldn't lose it. Thank you.

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago)

You are welcome. Hope you get the most from whatever you choose (:

[–] [email protected] 4 points 5 months ago (1 children)

WriteFreely is pretty nice, it uses the ActivityPub protocol and is thus a part of the Fediverse - just like Lemmy and Mastodon.

[–] [email protected] 2 points 5 months ago

I'm starting to like the idea of a writefreely more and more.

[–] [email protected] 4 points 6 months ago

i don't think there is a good solution for you.
if it's out there, somebody will find it.

you could host your own WordPress instance independent of WordPress.com.

and you could add a robots.txt to tell google to not scan your content, or even completely block the user agents of known search engines.

but blocking search engines is rather counterproductive if you want readers to find your blog.

and even then more nefarious crawlers might ignore the robots.txt and spoof their user agent to find you.

[–] [email protected] 4 points 6 months ago* (last edited 6 months ago)

WordPress is libre software, so nobody can ban us from removing malicious source code. Medium is a service as a software substitute. If you don't want AI reading your text, don't share it.

[–] [email protected] 4 points 6 months ago (1 children)

You could buy a cheap VPS and host your stuff there with basic HTML that you could learn as you go if you don't know it already. I think there are pre-made licenses that you could put on there to stop AI training. You could also hide pages on it full of garbage data so that anyone who ignores the license gets bad results.
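
The "pages full of garbage data" idea could be sketched roughly like this in Python. Everything here is hypothetical: the word list, the `decoy_page` name, and the page layout are made up for illustration, and whether decoys actually degrade anyone's training data is untested. Listing the decoy path under `Disallow` in robots.txt would keep rule-following crawlers (including search engines) away from it, so only crawlers that ignore the rules ever see it.

```python
import random

# Hypothetical word list; any vocabulary would do.
WORDS = ("squirrel", "paratrooper", "teapot", "quantum", "marmalade",
         "ledger", "umbrella", "violet", "gravel", "lantern")


def decoy_page(n_words: int, seed: int) -> str:
    """Build an HTML page of random words; the same seed gives the same page."""
    rng = random.Random(seed)  # private generator, so results are repeatable
    text = " ".join(rng.choice(WORDS) for _ in range(n_words))
    return f"<!doctype html><html><body><p>{text}</p></body></html>"
```

For example, `decoy_page(500, seed=1)` produces one static file's worth of nonsense that can be written to disk and left unlinked from the real posts.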

[–] [email protected] 2 points 5 months ago

OP means WordPress.com, a hosting service and site builder that runs on the WordPress.org engine. A VPS solves it completely, but they don't find paying for hosting worth it.

[–] [email protected] 3 points 6 months ago (2 children)
[–] [email protected] 3 points 6 months ago (1 children)

The one thing I dislike about WriteFreely is that it doesn't support comments. What are blogs without comments?

[–] [email protected] 1 points 6 months ago (1 children)

I didn't know about that. That's bad. Is it not even planned?

[–] [email protected] 3 points 6 months ago* (last edited 6 months ago)

They (write.as) decided against it, keeping it a strict uhm... writing platform.

It would be so easy to display ActivityPub Comments under the Article, but oh well.

[–] [email protected] 2 points 6 months ago

How did I not know that tchncs.de was more than just a Lemmy instance?

[–] [email protected] 3 points 6 months ago (1 children)
[–] [email protected] 2 points 6 months ago

Sent as a private message

[–] [email protected] 3 points 5 months ago

Nothing to contribute that hasn't already been said, but very interested in your blog as well!

[–] [email protected] 2 points 6 months ago

I used GitHub Pages, now I use Codeberg Pages. It's nice; I suggest you try it out (for static content like a blog, it's great).

[–] [email protected] 2 points 5 months ago (1 children)

Care to share your website URL? I'm interested in it.

[–] [email protected] 2 points 5 months ago* (last edited 5 months ago) (2 children)

Maybe https://bearblog.dev: very simple, but I think it's enough for writing stories. It's free, open source and private.

  • A privacy-first, no-nonsense, super-fast blogging platform
  • No trackers, no javascript, no stylesheets. Just your words.
  • This is a blogging platform where words matter most.
  • Shun the bloat of the current web, embrace the bear necessities.
  • Looks great on any device
  • Tiny (~2.7kb), optimized, and awesome pages
  • No trackers, ads, or scripts
  • Seconds to sign up
  • Connect your custom domain
  • Free themes
  • RSS & Atom feeds
  • Built to last forever