web

Gotta block ’em all

Robb Knight, a software developer who found that Perplexity was circumventing robots.txt to scrape websites it wasn’t supposed to, told 404 Media there are many cases where it’s hard to tell what a user agent does or who operates it. “What’s happening to people, including me, is copy-pasting lists of agents without verifying every agent is a real one,” he said. Knight added that the Wall Street Journal and many News Corp-owned websites are currently blocking a bot called “Perplexity-ai,” which may or may not even exist (Perplexity’s crawler is called “PerplexityBot.”)

Source: Websites are Blocking the Wrong AI Scrapers (Because AI Companies Keep Making New Ones)

The solution we used on this here blargh is simple. We blocked everyone in robots.txt:


User-agent: *
Disallow: /
Crawl-delay: 360
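
For the curious, here is a minimal sketch of how a well-behaved crawler would read those three lines, using Python's standard `urllib.robotparser`. The agent names come from the article quoted above, and the `example.com` URL is a placeholder we made up. The point of the wildcard is that it doesn't matter whether "Perplexity-ai" is a real bot or not: every name gets the same answer. None of this stops a crawler that ignores robots.txt.

```python
from urllib.robotparser import RobotFileParser

# The same rules shown above, inlined so the sketch runs without a network call.
ROBOTS_TXT = """\
User-agent: *
Disallow: /
Crawl-delay: 360
"""


def load_rules(text: str) -> RobotFileParser:
    """Parse robots.txt content into a RobotFileParser."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


rules = load_rules(ROBOTS_TXT)

# Real or imaginary, every agent falls under the wildcard entry.
for agent in ("PerplexityBot", "Perplexity-ai", "AhrefsBot"):
    allowed = rules.can_fetch(agent, "https://example.com/any/page")
    delay = rules.crawl_delay(agent)
    print(f"{agent}: allowed={allowed}, crawl delay={delay}s")
```

Every agent comes back as not allowed, which is the whole idea: no list of bot names to copy, paste, or verify.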

We also like to have a terminal window to look at what’s currently hitting the server and we block bots liberally. We’ve already blocked ahrefs.com and perplexitybot for being assholes without rate limits. Does this mean this here blargh will be that much harder to find? Yeah, but we don’t particularly care about it.
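
If staring at a terminal gets old, a rough sketch of the same idea is to tally user agents from the access log. This assumes the common "combined" log format, where the user agent is the last quoted field on each line. The sample lines, IP addresses, and agent strings below are invented for illustration and are not from our actual logs.

```python
import re
from collections import Counter

# In the combined log format the line ends with: "referer" "user agent"
UA_PATTERN = re.compile(r'"(?P<ua>[^"]*)"\s*$')


def count_user_agents(lines):
    """Count how many requests each user agent string made."""
    counts = Counter()
    for line in lines:
        match = UA_PATTERN.search(line)
        if match:
            counts[match.group("ua")] += 1
    return counts


# Invented sample lines; a real run would iterate over the log file instead.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Aug/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" "ExampleBot/1.0"',
    '203.0.113.5 - - [01/Aug/2024:10:00:01 +0000] "GET /feed HTTP/1.1" 200 900 "-" "ExampleBot/1.0"',
    '198.51.100.7 - - [01/Aug/2024:10:00:02 +0000] "GET / HTTP/1.1" 200 512 "-" "SomeBrowser/5.0"',
]

counts = count_user_agents(SAMPLE_LOG)
for agent, hits in counts.most_common():
    print(f"{hits:5d}  {agent}")
```

Whatever floats to the top of that list with no sign of rate limiting is a candidate for blocking at the server, since robots.txt only works on crawlers that bother to read it.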

Closing quickly, for that matter.

Social networks are universally more restrictive than web pages but also more fun in significant ways, chief amongst them being that more people can participate. What if the rest of the web had that simplicity and immediacy, but without the centralization? What if we could start over?

Source: A clean start for the web – macwright.com

Mozilla is knowingly walking away from any of these options because they’re bitter they could not come to dominate the Web after Firefox helped bring about the downfall of Internet Explorer. Big Tech will not support a reimagining of what the web could be since it would mean lower profits. Can’t have that in a capitalist society, now can we?

There’s hope now that the Servo engine is cut loose, but the time window to avoid having a technological cycle (about 30 years or so) be dominated by corporations is closing.

dafuq does RSS even mean, seriously

Source: How would I improve RSS? Three ideas (Interconnected)

We rely heavily on RSS to find things to read and keep up with The Noise on the Internet. We also tend to shun newsletters cos RSS is a much better tool for them and en’t nobody got time for yet more email.

We’re aware of other initiatives like JSON Feed but they require re-implementing RSS into something else. Maybe the solution is an evolution of what the protocol currently is?

Start with a proper name for the protocol though. Bonus points if someone figures out how to make it recursive.

And a special version of Flash for games only?

Are we ready to revisit some of the ideas of the early web again? There are trends that suggest we might just have come full circle – and I like it.

Source: The Return of the 90s Web | Max Böck

The only sites that won’t have an RSS feed are those of corporate entities that explicitly depend on keeping people on their sites, like fb.

Hopefully some enterprising engineer at Google has found the Google Reader source code and is bringing it back to life…

Yes, it is

http://alexanderperrin.com.au/paper/shorttrip/

Is it a game even if there’s no objective?

Flickr: here we go again

So in the future, when the revenue coming from paying members is small enough to ignore, and the advertising numbers come in below expectations (as they often do), my fear is that Yahoo will come to an almost inevitable business decision: To kill Flickr.

via The new Flickr: Goodbye customers, hello ads | TechHive.

I’ve been a Flickr user for… a long time (actually, I went to try and find out and it doesn’t tell you anymore) and a paying Flickr Pro member on and off for about 4 years. I joined when it was riding high on the web 2.0 wave and stayed through the slog that was the Yahoo acquisition years.

Now… this. They want fifty bucks just for not displaying ads. No other benefit but that. Sure, I like the new website design, and the new mobile app… but the underlying functionality of the site will be much downgraded now, and the mobile app still will not post to Twitter.

Hell, I’ve received mails from Flickr telling me to convert to a free account from my paid Pro account. They want to let go of a sure 25 dollars so they can put ads on my pages; ads that most likely will not pull 25 dollars — much less 50 dollars — in a year of service.

Like Mr. Powazek says, that gamble better pay off, although I’ll hedge my bet and say that in two years’ time it won’t have, and Flickr will be unceremoniously killed by the suits at Yahoo.

Once again, Flickr is being treated like a fucking database.

Security in WordPress

I’m not saying WordPress isn’t secure, but the general perception seems to be:

“WordPress is not secure”

TechCrunch says so, people complain to Matt about it, JD of Get Rich Slowly had big trouble, and there are a ton of tips and tutorials all over the place. The Codex entry on Hardening WordPress is missing some stuff… and the perception keeps turning more and more negative. If it keeps up like this, some other platform will come along loudly claiming to be more secure than the rest, and a lot of people will migrate just because of that.

To avoid this, I feel the focus of WordPress 2.7 should be security. We already have a stable and flexible platform to establish and maintain blogs, so now it must become a secure platform.

Fresh E-NEWS, including HTML

For those who know how to make webpages, the source code below is going to look pretty stupid. For those who don’t, the code is a demonstration of how not to make pages.

The page’s source code in all its glory

Done laughing? Good… because you should know that source is for a frame to be displayed on the page. The final result is below.

The page’s source code in all its glory

Don’t those two scrollbars rock your world? They da shiznit.

The really cool part is that if I tell the people in charge of the page, I could get fired for looking at things they don’t want me looking at :P

RSS

This past Sunday I gave myself the task of finding a new RSS reader. The cause? Google Reader. Most of my RSS feeds get read on a laptop without a working internet connection. Google Gears helps… when it works.

I tried using the extension for a week and had varied results. Downloading feeds to the computer sometimes took 15 minutes. Sometimes it failed and I had to take the computer closer to the wireless router and try again; doing that fixed most of the conflicts. But it’s a drag having to move around just to have something to read at night.

Then there’s the issue of actually using Google Reader while the computer is offline. Everyone knows Firefox is a bit overweight and likes to grab all the RAM it can. Running it on a computer with 256MB of RAM doesn’t help: even though I’m not using any extensions, simply having the program open made the computer slower. Google Reader went from a speedy machine to slow junk.

With that my search had begun… first I tried FeedDemon… but it wanted me to create an account with NewsGator to even be able to use the program. It didn’t last more than 5 minutes installed.

Next up I had BlogBridge… it likes its 100MB of RAM, but runs quite nicely and doesn’t interrupt other programs. On my desktop computer I run the monitor at 1600×1200, so at the beginning text looked quite small… but since I didn’t read the instructions I didn’t figure out how to resize text :P so I left it alone and kept searching.

Then I tried RSSOwl… hmm, it’s Java. It doesn’t have an “offline” mode. It competes with Firefox to see who eats the most RAM. Being Java, the interface is not that “good” but it’s certainly usable; the main issue is the scrolling… it feels wrong. As I set about using it I realized it lacked that little something that tells you this is the application you had been missing.

Last up for the day was Thunderbird… I had never used it to read feeds. Given it is a mail client, reading RSS works the same way as reading mail. The first miss was feed management… it’s too much work, from importing feeds (they arrive as one unsorted bunch instead of separated into folders, as the OPML markup describes) to managing them once imported. The last straw was its insistence on loading the full article page whenever it had the chance. This slows you down and breaks your train of thought, as it takes a while to display pages with a lot of multimedia content (Kotaku, Make), and my efforts to disable this bore no fruit. The advantage it had over the other programs was having my calendar (Lightning) right there… but that’s pretty useless if it takes you longer to read less.

With these results the overall winner turned out to be BlogBridge. It wasn’t by much… but it’s something. After using it for a few hours I discovered how to resize text (ctrl-Plus) and the main usability problem disappeared. Next week I’ll put up a more detailed review of my experience with it.
