Can you trust the code in your editor? Bidirectional text
def maps():
    print "maps maps maps"

def spam():
    print "Erasing everything..."
    print "done."
Did you know that if you stare at the next line long enough, all that remains of it is the word “spam” three times?
s = "spam" ,spam ,"spam"
s()
That first line really is unusual: running this code ends up calling the malicious spam function.
View on ideone. (For those unfamiliar with it: the output of the executed program is shown at the bottom of the page.)
The root of our problem is how bidirectional text works: in memory, text is always stored in the order in which a person typed it. That holds for right-to-left writing too; the text is only reversed into its customary direction when it is drawn.
The rendering direction is determined automatically: for letters, by the alphabet they belong to (Hebrew, for example); for punctuation marks and digits, by trickier rules that depend on the surrounding context.
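To make the "determined automatically" part concrete, here is a minimal sketch (assuming Python 3 and its standard unicodedata module) that prints the bidirectional class Unicode assigns to a few characters. The sample characters and the helper name bidi_class are chosen purely for illustration.

```python
import unicodedata

# Illustrative samples: a Latin letter, a Hebrew letter, a digit,
# a comma, a space, and the two formatting characters discussed here.
samples = ["a", "\u05d0", "1", ",", " ", "\u202e", "\u202c"]

def bidi_class(ch):
    """Return the Unicode bidirectional class of a single character."""
    return unicodedata.bidirectional(ch)

for ch in samples:
    # Print the code point rather than the character itself, so that
    # the output is not affected by bidirectional rendering.
    print("U+%04X %s" % (ord(ch), bidi_class(ch)))
```

Letters carry a strong direction (L or R), while digits, punctuation and whitespace get weaker classes whose final direction depends on their neighbours.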
RLO (U+202E) is a formatting character; the name stands for right-to-left override. It forces right-to-left rendering on characters that are left-to-right by default. (The standard says it can be used to write identifiers that mix Hebrew and English, where the English parts are apparently meant to be read from right to left as well.)
So, thanks to it, we get our little trick:
s = "spam
", spam," spam "
s = "spam", spam, "spam"
PDF (U+202C) stands for pop directional formatting; it cancels the effect of the most recent RLO or one of its relatives.
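As a rough illustration of storage order versus display order, the sketch below (assuming Python 3) wraps part of a made-up string in an RLO ... PDF pair and lists its characters in the order they are stored. The string and the helper describe are hypothetical, not the line from the example above.

```python
import unicodedata

RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING

# Hypothetical sample: only "def" sits inside the override.
text = "abc" + RLO + "def" + PDF + "ghi"

def describe(s):
    """List characters in storage order, naming invisible format characters."""
    out = []
    for ch in s:
        if unicodedata.category(ch) == "Cf":  # Cf = format character
            out.append("<" + unicodedata.name(ch) + ">")
        else:
            out.append(ch)
    return out

print(len(text))       # the two invisible characters are counted too
print(describe(text))  # "d", "e", "f" are still stored in typing order
```

A bidi-aware renderer would draw the middle part reversed, but nothing in memory changes: the interpreter sees "def" in the order it was typed, plus two extra characters.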
As you might guess, the interpreter does not care about strange characters inside string literals. But some editors, such as emacs*, Xcode and Kate, reverse the text in between exactly as the browser does.
* In the case of emacs, the behavior may depend on the terminal. In vim and nano, however, there is no problem in the same terminal: both show only the code of the RLO character at the corresponding position.
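If your editor does not show these characters, a small script can do what vim and nano do. The following is a minimal sketch, assuming Python 3; the set BIDI_CONTROLS and the function names reveal and find_controls are my own, and the set also includes the mark and isolate characters, which go beyond what this article discusses.

```python
# Assumed list of bidi control characters: marks, embeddings,
# overrides, pop, and the newer isolate characters.
BIDI_CONTROLS = set(
    "\u200e\u200f"                 # LRM, RLM
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"     # LRI, RLI, FSI, PDI
)

def reveal(text):
    """Replace each bidi control with a visible <U+XXXX> marker."""
    return "".join(
        "<U+%04X>" % ord(ch) if ch in BIDI_CONTROLS else ch
        for ch in text
    )

def find_controls(source):
    """Return (line, column, code point) for every bidi control, 1-based."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for col, ch in enumerate(line, 1):
            if ch in BIDI_CONTROLS:
                hits.append((lineno, col, "U+%04X" % ord(ch)))
    return hits

sample = 'ok = 1\nx = "a\u202eb"'
print(reveal(sample))
print(find_controls(sample))
```

Running find_controls over a source file before trusting what the editor displays is one cheap way to catch this kind of trick.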
Other uses in code
The RLO character is not whitespace, and on top of that Python rejects it inside identifiers, which somewhat limits where it can be used.
It has to go into string literals or comments. In addition, the effect of an RLO ends at the end of each line of the file, since each line counts as a separate paragraph.
... and again, a link to ideone.
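The restriction can be checked directly. This is a small sketch assuming Python 3 (the article's own snippets are Python 2, where the details of source encoding differ); the helper compiles and the three sample lines are made up for the demonstration.

```python
RLO = "\u202e"  # RIGHT-TO-LEFT OVERRIDE

def compiles(source):
    """Return True if Python accepts the source text, False on SyntaxError."""
    try:
        compile(source, "<demo>", "exec")
        return True
    except SyntaxError:
        return False

in_identifier = "sp" + RLO + "am = 1"       # RLO inside a name
in_string = 's = "sp' + RLO + 'am"'        # RLO inside a string literal
in_comment = "x = 1  # sp" + RLO + "am"    # RLO inside a comment

print(compiles(in_identifier))
print(compiles(in_string))
print(compiles(in_comment))
```

Only the first line is rejected; string literals and comments accept the character silently, which is exactly where the trick hides.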
Update: there is also a variant that uses the Embedding and Mark characters:
"rm -rf echo", which in fact only prints "rm -rf"
echo -e '\xe2\x80\x8f\xe2\x80\xaaecho \xe2\x80\xac\xe2\x80\x8f\xe2\x80\xaarm -rf \xe2\x80\xac\xe2\x80\x8f'
For some reason, bash ignores formatting characters at the beginning of the command, which opens up a lot of scope for mischief.
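To see what the echo command above really emits, the byte escapes can be decoded. This is a minimal sketch assuming Python 3; the helper name logical is mine, and it simply drops format characters to recover the text as the shell would read it.

```python
import unicodedata

# The same UTF-8 byte escapes as in the echo command.
raw = (b"\xe2\x80\x8f\xe2\x80\xaaecho \xe2\x80\xac\xe2\x80\x8f"
       b"\xe2\x80\xaarm -rf \xe2\x80\xac\xe2\x80\x8f")
text = raw.decode("utf-8")

def logical(s):
    """Drop invisible format characters, keeping the stored order."""
    return "".join(ch for ch in s if unicodedata.category(ch) != "Cf")

# Names of the invisible characters, in the order they appear.
controls = [unicodedata.name(ch) for ch in text
            if unicodedata.category(ch) == "Cf"]

print(controls)
print(repr(logical(text)))
```

The stored text is "echo rm -rf " wrapped in right-to-left marks and left-to-right embeddings, which is why the command on screen and the command that runs differ.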