X updates its terms to ban crawling and scraping - eviltoast

The new terms, which are effective from September 29, ban any kind of scraping or crawling without “prior written consent.”

NOTE: crawling or scraping the Services in any form, for any purpose without our prior written consent is expressly prohibited.

The previous version of the terms allowed crawling in accordance with robots.txt.

“NOTE: crawling the Services is permissible if done in accordance with the provisions of the robots.txt file, however, scraping the Services without our prior consent is expressly prohibited,” it read.

In the last few months, Twitter has also altered its robots.txt file — a file that gives instructions to robot crawlers about what parts of the site they are permitted to visit — to remove instructions for all crawler bots apart from Google.
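For a rough sense of how a crawler reads such a file, here is a minimal sketch using Python's standard `urllib.robotparser`. The robots.txt content, the `is_allowed` helper, the `ExampleBot` name, and the example.com URL are all made up for illustration. This is not X's actual file, only an assumed one that allows a single named crawler and disallows everyone else.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration only (NOT X's real file):
# one named crawler is allowed everywhere, every other bot is disallowed.
EXAMPLE_ROBOTS_TXT = """\
User-agent: Googlebot
Allow: /

User-agent: *
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    # parse() takes the file as a list of lines, so no network access is needed.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# Hypothetical URL and bot names, chosen only to exercise both rule groups.
url = "https://example.com/some/post"
googlebot_ok = is_allowed(EXAMPLE_ROBOTS_TXT, "Googlebot", url)
other_ok = is_allowed(EXAMPLE_ROBOTS_TXT, "ExampleBot", url)
```

Note that robots.txt is only a set of instructions that well-behaved crawlers choose to honor. Nothing in the file itself blocks a request.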

In 2015, Twitter confirmed that it had a firehose deal in place with Google to surface tweets in search results. It is not clear if the nature or terms of that deal have changed under the new management.

    • IHeartBadCode@kbin.social
      1 year ago

      Yeah that’s literally UNENFORCEABLE. We just had a case last year that indicated that you can scrape data from sites so long as the data being scraped isn’t used for profit.

      Additionally, scrapers cannot be legally held to have agreed to the TOS. Simply typing an address in and receiving a page back doesn’t mean that anyone agrees to the TOS of the server that gave the page. It’s pretty much the same reason software couldn’t enforce the “if you don’t agree with the terms on the CD-ROM, then you cannot open the package the CD-ROM is in” clause. So X writing that in their TOS has zero bearing on whether they can actually enforce it through the court system, and the answer to that is likely going to be a big NAH.

      That’s based on the gate-up/gate-down test given by the courts. If a normal person can find a random “tweet” (are we still calling them that?) by typing a URL, the gate is up, and you cannot pick and choose who gets to enter. If you don’t want a random tweet being scraped, the gates must be down. That means nobody typing in a random URL can ever access that tweet; they have to go through the gate house to gain entry to the resource. But gates down means that no one is going to link to a tweet, because when people click the link, instead of seeing the related information, they get handed a login page. X has been trying that, and news outlets have been bitching that they’re not going to post tweets in their stories if Musk is just going to block everyone.

      The thing that X could argue is that someone is using their tweets for “profit,” which is exactly the case they’re trying with the ADL and the CCDH. They’re trying to argue that these not-for-profits are profiting off of convincing ad buyers to not buy ads. Which, if that sounds crazy, OH BOY IS IT. However, Musk’s lawyers have attempted to muddy the waters on what counts as “PROFIT.” So grab some popcorn for that one.

      The thing is, I get that Musk wants to hold tight copyright on the tweets and not surface a lot to others who might use that data for who knows what purpose. BUT you cannot have your cake and eat it too. Musk doesn’t get the best of both worlds. He can put everything behind a wall and attempt to enforce his TOS, but that’s still not really going to go well for his ADL/CCDH case. Or he can surface the tweets for the Internet to read. But he cannot have both. We’ve settled that in the courts, and Congress hasn’t made any kind of motion toward changing that standing.