
CloudFlare ScrapeShield bypass in Java (Android)

At one point I had to solve the problem named in the title. I hesitated for a long time over whether to write up something so obvious, but in the end decided it might be useful to someone.
The reason is quite simple: I am the author of an Android client for a very niche site, but I am neither one of its administrators nor one of its co-founders. As a result, I learn about the site management's decisions only when they actually take effect.
Not long ago the site came under a DDoS attack, and the administration turned on CloudFlare's DDoS protection. As a result, the client application, which had used a standard POST + Cookie authentication mechanism, could no longer log users in. Talking to the administration led nowhere: "what can we do, better a site without mobile clients than no site at all."
Naturally, all this began to affect the ratings and gave rise to very unflattering reviews.
The solution was to bypass the CloudFlare protection by simulating a browser's behavior on its challenge page. In this particular case CloudFlare uses a hash, a key and randomly generated JavaScript code that the browser executes (a handful of arithmetic operations that look like obfuscated garbage); the browser then sends the resulting number, together with the hash and the key, to the verification page. Our task, therefore, is to intercept the JavaScript challenge, solve it by any means, and ask whether our answer is correct. If it is, we get a treat (the cf_clearance cookie). If not, we get a 503.
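As a rough illustration of the first step, the hash and the key can be pulled out of the challenge page with regular expressions. The sketch below reuses the two form field names from the full listing further down (jschl_vc and pass); the class name ChallengeFields and the sample markup are hypothetical, made up for this example, and the real page may look different.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class ChallengeFields {
    // Hidden form fields that carry the hash (jschl_vc) and the key (pass).
    private static final Pattern VC = Pattern.compile("name=\"jschl_vc\" value=\"(\\w+)\"");
    private static final Pattern PASS = Pattern.compile("name=\"pass\" value=\"(.+?)\"");

    /** Returns {jschl_vc, pass}, or null if the page does not contain both fields. */
    static String[] extract(String html) {
        Matcher vc = VC.matcher(html);
        Matcher pass = PASS.matcher(html);
        if (!vc.find() || !pass.find()) {
            return null; // not a challenge page, or the markup has changed
        }
        return new String[] { vc.group(1), pass.group(1) };
    }

    public static void main(String[] args) {
        // Hypothetical challenge markup, for illustration only.
        String html = "<input type=\"hidden\" name=\"jschl_vc\" value=\"abc123\"/>"
                + "<input type=\"hidden\" name=\"pass\" value=\"1400000000.123-XyZ\"/>";
        String[] fields = extract(html);
        System.out.println("hash=" + fields[0] + " key=" + fields[1]);
    }
}
```

Returning null when either field is missing lets the caller treat an ordinary page and a changed challenge format the same way: no attempt is made to answer.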
A web search turned up exactly one link, to a project doing something very similar. It is written in Python and uses node.js or another PyExecJS-compatible runtime. With all due respect to Python, pulling it into a lightweight niche application was an unjustifiable luxury that could have taken hours to integrate. I made the strategic decision to rewrite the solver code in Java.
Some notes on issues that came up while writing the code:
- Mozilla Rhino was chosen as the JS engine. With optimizations disabled it runs in interpreted mode, which makes it usable on Dalvik.
- Requests whose User-Agent is typical of automated clients are rejected with Error 503. "Java/1.5.0_08", "libcurl-agent/1.0" and similar strings are turned away immediately. Before you try anything else, present the User-Agent of a modern browser.
- Apache's implementation was used as the HTTP client. I trust it more than HttpURLConnection, which the Android developers promote, but that is a matter of taste. Any compatible implementation will do, for example OkHttpClient.
- Important: if you later want to display data from the site in a WebView, there are two things to consider:
  - The HTTP client must send exactly the same User-Agent as the WebView (read it with webView.getSettings().getUserAgentString())
  - After receiving the cf_clearance cookie, you need to synchronize it with the WebView (example code below)
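The User-Agent note above can be sketched in a few lines. This example uses HttpURLConnection only because it ships with the JDK and keeps the snippet self-contained; the article's own code uses the Apache client. The class name BrowserLikeRequest and the User-Agent string are assumptions for illustration; in a real application the string would be the one reported by the WebView.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.HttpURLConnection;
import java.net.URL;

final class BrowserLikeRequest {
    // Example browser User-Agent (an assumption); substitute the value your WebView reports.
    static final String USER_AGENT =
            "Mozilla/5.0 (Linux; Android 4.4.2; Nexus 5) AppleWebKit/537.36 "
            + "(KHTML, like Gecko) Chrome/34.0.1847.114 Mobile Safari/537.36";

    /** Prepares (but does not send) a request that identifies itself with the given User-Agent. */
    static HttpURLConnection prepare(String url, String userAgent) {
        try {
            HttpURLConnection connection = (HttpURLConnection) new URL(url).openConnection();
            // Replaces the default "Java/<version>" agent that would be sent otherwise.
            connection.setRequestProperty("User-Agent", userAgent);
            return connection;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        HttpURLConnection connection = prepare("http://www.example.com/", USER_AGENT);
        // No network traffic happens until the response is actually read.
        System.out.println(connection.getRequestProperty("User-Agent"));
    }
}
```

Keeping the User-Agent in one constant makes it easier to guarantee that the HTTP client and the WebView present the same value.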
The final version is below. It was thrown together in a hurry, but it gives a basic idea of how everything works.
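import java.io.IOException;
import java.net.URI;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

import android.webkit.CookieManager;

import org.apache.http.HttpResponse;
import org.apache.http.HttpStatus;
import org.apache.http.NameValuePair;
import org.apache.http.client.CookieStore;
import org.apache.http.client.utils.URLEncodedUtils;
import org.apache.http.cookie.Cookie;
import org.apache.http.message.BasicNameValuePair;
import org.mozilla.javascript.Context;
import org.mozilla.javascript.Scriptable;

// the arithmetic inside the challenge script
private final static Pattern OPERATION_PATTERN = Pattern.compile("setTimeout\\(function\\(\\)\\{\\s+(var t,r,a,f.+?\\r?\\n[\\s\\S]+?a\\.value =.+?)\\r?\\n");
// the key
private final static Pattern PASS_PATTERN = Pattern.compile("name=\"pass\" value=\"(.+?)\"");
// the hash
private final static Pattern CHALLENGE_PATTERN = Pattern.compile("name=\"jschl_vc\" value=\"(\\w+)\"");

// supplied by the concrete HTTP client wrapper
public abstract HttpResponse getPage(URI url, HashMap<String, String> headers) throws IOException;
public abstract CookieStore getCookieStore();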
public boolean cloudFlareSolve(String responseString) {
    // initialize Rhino
    Context rhino = Context.enter();
    try {
        String domain = "www.example.com";

        // CF expects the answer only after some delay
        Thread.sleep(5000);

        // pull out the arithmetic
        Matcher operationSearch = OPERATION_PATTERN.matcher(responseString);
        Matcher challengeSearch = CHALLENGE_PATTERN.matcher(responseString);
        Matcher passSearch = PASS_PATTERN.matcher(responseString);
        if (!operationSearch.find() || !passSearch.find() || !challengeSearch.find())
            return false;

        String rawOperation = operationSearch.group(1); // the operation
        String challengePass = passSearch.group(1); // the key
        String challenge = challengeSearch.group(1); // the hash

        // cut out the assignment to the form field
        String operation = rawOperation
                .replaceAll("a\\.value =(.+?) \\+ .+?;", "$1")
                .replaceAll("\\s{3,}[a-z](?: = |\\.).+", "");
        String js = operation.replace("\n", "");

        rhino.setOptimizationLevel(-1); // without this line Rhino will not start on Android
        Scriptable scope = rhino.initStandardObjects(); // initialize the execution scope

        // either do or die trying
        // Context.toNumber accepts whichever numeric type Rhino returns, avoiding a ClassCastException
        int result = (int) Context.toNumber(rhino.evaluateString(scope, js, "CloudFlare JS Challenge", 1, null));
        String answer = String.valueOf(result + domain.length()); // answer to the JavaScript challenge

        final List<NameValuePair> params = new ArrayList<>(3);
        params.add(new BasicNameValuePair("jschl_vc", challenge));
        params.add(new BasicNameValuePair("pass", challengePass));
        params.add(new BasicNameValuePair("jschl_answer", answer));

        HashMap<String, String> headers = new HashMap<>(1);
        headers.put("Referer", "http://" + domain + "/"); // URL of the page the redirect came from

        String url = "http://" + domain + "/cdn-cgi/l/chk_jschl?" + URLEncodedUtils.format(params, "windows-1251");
        HttpResponse response = getPage(URI.create(url), headers);
        if (response.getStatusLine().getStatusCode() == HttpStatus.SC_OK) { // the response is the page named in Referer
            response.getEntity().consumeContent(); // the content can be used however you like
            return true;
        }
    } catch (Exception e) {
        return false;
    } finally {
        Context.exit(); // shut Rhino down
    }
    return false;
}
private void syncCookiesWithWebViews() {
    List<Cookie> cookies = getCookieStore().getCookies();
    CookieManager cookieManager = CookieManager.getInstance(); // CookieManager keeps cookies in sync across WebViews
    for (Cookie cookie : cookies) {
        String cookieString = cookie.getName() + "=" + cookie.getValue() + "; domain=" + cookie.getDomain();
        cookieManager.setCookie("diary.ru", cookieString);
    }
}
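The last step of cloudFlareSolve, composing the answer and the verification URL, can be looked at in isolation. The sketch below follows the listing: the answer is the evaluated JavaScript result plus the length of the domain name, and the three parameters go to /cdn-cgi/l/chk_jschl. The class name VerificationUrl is hypothetical, and the JDK's URLEncoder with UTF-8 stands in for the Apache URLEncodedUtils call with windows-1251 used above, on the assumption that the values are plain ASCII.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class VerificationUrl {
    /** The answer is the evaluated JavaScript result plus the length of the domain name. */
    static String answer(int jsResult, String domain) {
        return String.valueOf(jsResult + domain.length());
    }

    /** Builds the GET URL that submits the hash, the key and the answer for verification. */
    static String build(String domain, String jschlVc, String pass, int jsResult) {
        return "http://" + domain + "/cdn-cgi/l/chk_jschl"
                + "?jschl_vc=" + encode(jschlVc)
                + "&pass=" + encode(pass)
                + "&jschl_answer=" + encode(answer(jsResult, domain));
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException(e); // UTF-8 is always available
        }
    }

    public static void main(String[] args) {
        // Sample values, for illustration only.
        System.out.println(build("www.example.com", "abc123", "1400000000.123-XyZ", 27));
    }
}
```

With the sample values, the domain name is 15 characters long, so a script result of 27 gives an answer of 42.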
The client code is published under GPLv3, so CloudFlare will most likely find out about it too, which will lead to a change in the algorithm. Nevertheless, I am no fan of security through obscurity, and I managed to solve the problem of letting mobile users in before the DDoS subsided.
Thanks for your attention. Questions and remarks are welcome in the comments.