How Monitoring All Call Traces Can Detect and Prevent Data Exfiltration
Findadoctor.com Data Leakage
It was reported that information about 1.4 million US doctors was leaked (https://apisecurity.io/issue-79-1-4-million-doctor-records-scraped-using-api/) when bad actors appear to have taken advantage of a GitLab file upload vulnerability (https://about.gitlab.com/blog/2020/03/30/how-to-exploit-parser-differentials/).
The technical details of the vulnerability are quite complex, which is one reason why it went undetected long enough for bad actors to take advantage of it. To put it simply, the root cause is that a front-end service did not interpret API parameters the same way as the back-end service did.
Here is a synopsis of what happened. The website (findadoctor.com) uses GitLab. GitLab implements file uploads using the RESTful “PUT” method. The front-end proxy, “gitlab-workhorse,” accepts the “PUT” of a file upload, writes the file into a /tmp directory, and then makes a second “PUT” API call to the back end with a “filepath” parameter pointing to the uploaded file under /tmp. The back end then picks up the file at this “filepath” and places it in the requester’s uploaded_file directory.
This mechanism alone is not the problem. What is problematic is that the “workhorse” also supports what is called “method override.” Method override, widely used by Rails applications, allows the workhorse to switch a POST method to a PUT or DELETE when proxying the call to the back end. A POST call is typically passed along to the back end with all of its parameters intact. A bad actor could therefore make a “POST” API call that carries the override header and an explicit “filepath” parameter pointing to a victim’s file in the /tmp directory. Once this “POST” call is switched to a “PUT” with the “filepath” parameter unmodified, the unsuspecting back-end gitlab-rails application interprets the “PUT” call and its “filepath” as a file uploaded by the requester. It then fetches the victim’s file and places it under the uploaded_file directory belonging to the bad actor. The end result is exfiltration of the victim’s file.
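The mismatch described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not GitLab's actual code: the names Request, proxy_forward, backend_handle, the header name X-HTTP-Method-Override, and the /tmp/upload-N path scheme are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical header name; the real header and its handling may differ.
OVERRIDE_HEADER = "X-HTTP-Method-Override"


@dataclass
class Request:
    method: str
    headers: dict = field(default_factory=dict)
    params: dict = field(default_factory=dict)
    body: bytes = b""


def proxy_forward(req, tmp_store):
    """Front end: rewrites genuine PUT uploads, passes everything else along."""
    if req.method == "PUT":
        # Legitimate upload: the proxy stores the file and sets filepath itself.
        path = f"/tmp/upload-{len(tmp_store)}"
        tmp_store[path] = req.body
        return Request("PUT", params={"filepath": path})
    override = req.headers.get(OVERRIDE_HEADER)
    if req.method == "POST" and override in ("PUT", "DELETE"):
        # Method override: the method changes, but the caller-supplied
        # parameters (including any filepath) are forwarded unmodified.
        return Request(override, params=dict(req.params))
    return req


def backend_handle(req, requester, tmp_store, user_dirs):
    """Back end: trusts filepath on any PUT, assuming the proxy set it."""
    if req.method == "PUT" and "filepath" in req.params:
        data = tmp_store[req.params["filepath"]]
        user_dirs.setdefault(requester, []).append(data)
        return "stored"
    return "ignored"


tmp_store, user_dirs = {}, {}

# The victim uploads a file through the normal PUT path.
victim_call = proxy_forward(Request("PUT", body=b"victim data"), tmp_store)
backend_handle(victim_call, "victim", tmp_store, user_dirs)

# The attacker sends a POST with the override header and a forged filepath.
attack = Request(
    "POST",
    headers={OVERRIDE_HEADER: "PUT"},
    params={"filepath": "/tmp/upload-0"},
)
backend_handle(proxy_forward(attack, tmp_store), "attacker", tmp_store, user_dirs)
```

In this model neither function is wrong in isolation: the proxy handles uploads and overrides as designed, and the back end handles PUT as designed. The victim's bytes end up in the attacker's directory only because the two services disagree about who is allowed to set “filepath.”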
What We Can Do: Monitor All Call Traces
As API-driven microservice architectures gain rapid adoption, “miscommunications” between services can lead to functional vulnerabilities whose consequences, once exploited, can be quite damaging. Worse, it is almost impossible for conventional tools to detect a bad actor’s attempts to exploit implementation gaps when the resulting API calls span multiple services.
It is almost impossible for code scanning to detect this kind of sophisticated vulnerability. If one looks at the code for method override and for file upload separately, each implementation is entirely proper.
It is equally difficult for a front-end gateway to detect. The complicit “workhorse” service is itself meant to be a “gateway,” and in this case it is functioning entirely according to spec.
The only way to detect such abuse is to monitor all APIs involved in an end-to-end “API transaction” (from the front end through all back-end calls). An automated anomaly detection engine capable of correlating calls can detect the abnormal call pattern. For the vulnerability in question, correlation can be done on the requester’s ID, which is passed along in all downstream calls. In a legitimate file upload, “filepath” is specified only in the call between the workhorse and the back end, never in the call from the client. In addition, no legitimate file upload exhibits a method override. By correlating front-end calls with back-end calls, this kind of Broken Function Level Authorization (API5 in the OWASP API Security Top 10) can be detected and prevented.
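The correlation rules above can be sketched roughly as follows. This is a minimal illustration rather than a real detection engine: the Call record, its fields, the hop labels, and find_suspicious are all hypothetical names, and it assumes that every traced call already carries the requester's ID.

```python
from collections import defaultdict
from dataclasses import dataclass

FRONT = "client->proxy"   # call received by the front-end proxy
BACK = "proxy->backend"   # call the proxy makes to the back end


@dataclass(frozen=True)
class Call:
    requester_id: str
    hop: str
    method: str
    has_filepath: bool
    overridden: bool = False  # True if the method was switched by an override


def find_suspicious(calls):
    """Group calls by requester ID and flag traces that break the upload rules."""
    traces = defaultdict(list)
    for call in calls:
        traces[call.requester_id].append(call)

    alerts = {}
    for requester_id, trace in traces.items():
        reasons = set()
        front_puts = sum(1 for c in trace if c.hop == FRONT and c.method == "PUT")
        back_uploads = [
            c for c in trace
            if c.hop == BACK and c.method == "PUT" and c.has_filepath
        ]
        # Rule 1: filepath may only appear between the proxy and the back end.
        if any(c.hop == FRONT and c.has_filepath for c in trace):
            reasons.add("filepath supplied by client")
        # Rule 2: no legitimate upload arrives via method override.
        if any(c.overridden for c in back_uploads):
            reasons.add("upload reached back end via method override")
        # Rule 3: each back-end upload needs a matching front-end PUT.
        if len(back_uploads) > front_puts:
            reasons.add("back-end upload without matching front-end PUT")
        if reasons:
            alerts[requester_id] = sorted(reasons)
    return alerts


calls = [
    Call("alice", FRONT, "PUT", has_filepath=False),
    Call("alice", BACK, "PUT", has_filepath=True),
    Call("mallory", FRONT, "POST", has_filepath=True),
    Call("mallory", BACK, "PUT", has_filepath=True, overridden=True),
]
alerts = find_suspicious(calls)
```

With these sample traces, only the requester labeled “mallory” is flagged. None of the three rules can be evaluated from a single hop; each depends on seeing the front-end and back-end calls of the same requester side by side, which is why end-to-end tracing matters here.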