
Illusions of XML / XSLT Technologies
Every so often the computer world sees a burst of interest in a particular technology. These bursts are not random; they are clearly backed by the vendors of those technologies. This is not surprising: it is hard to keep selling the same thing, and much easier to sell something new, or something old under a different name. Nothing sells better than a feature the previous model lacked. Why is the consumer built this way? Often the consumer's opinion is simply exploited, and the desire is imposed on him. Large software vendors frequently saturate their market and need a constant change of technology in order to sell upgrades or simply raise prices. It is also an easier way to set yourself apart from competitors: just assure everyone that you have the best and latest technologies.
This is what happened with XML. XML, after all, is basically nothing new. It is a simplified subset of SGML, which in turn goes back to IBM's GML of the late 1960s. In essence, XML simply standardized a format for exchanging information, and that is all.
But a miracle happened: XML arrived, marketing got something to advertise, and vendors began declaring on every corner that their databases already supported XML and that, in general, everything they made was steeped in XML.
No need to look far back for another example: AJAX is now the thing to advertise and sell. These days only the lazy fail to claim that they do AJAX.
That is simply how the world works. Not all new technologies are bad, and there is no need to meet them with hostility. But we need to be able to choose the right tools for specific tasks and to make those choices soundly, for our business and for the business problems we are solving.
In my article "Programming as an Art" I already mentioned programmers' enthusiasm for technological fashion. As I said, I am a programmer myself, and my colleagues and I were just as prone to getting carried away. In this article I will talk about our mistakes with XML / XSLT technologies. Perhaps this experience will be useful to someone and help them find the right place for these technologies. I am fairly sure, though, that this article will draw a lot of criticism my way.
Our company has developed several software products. The first ones did not find much of a response in the market, because they contained a number of technological and marketing mistakes.
One such product is Bitrix: Info Portal, one of the most sophisticated products we have created. It was built on ASP with an MSSQL database and was intended for creating web solutions on the Microsoft platform.
In the first version of this product (in 1999 or 2000, as I recall), we wrote our own template engine, in which instructions were placed in dedicated tags and looked something like #DATA()#. It had simple conditions and loops... Well, who has not gone through this? Everyone who has ever built a system meant to be handed over to someone else for use and configuration has written their own template engine.
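For illustration, the core of such a homegrown engine can be sketched in a few lines of Python. This is a hypothetical reconstruction, not the product's original ASP code: the #NAME()# placeholder syntax follows the example above, while the `render` function and the `values` dictionary are assumptions made for the sketch, and conditions and loops are left out.

```python
import re

# Matches placeholders such as #TITLE()# (assumed syntax, upper-case names only).
PLACEHOLDER = re.compile(r"#([A-Z_]+)\(\)#")


def render(template, values):
    """Replace every #NAME()# placeholder with the matching entry of `values`.

    Unknown names are rendered as an empty string.
    """
    def substitute(match):
        name = match.group(1)
        return str(values.get(name, ""))

    return PLACEHOLDER.sub(substitute, template)


page = render("<h1>#TITLE()#</h1><p>#BODY()#</p>", {"TITLE": "News", "BODY": "Hello"})
```

Every further feature, such as a condition or a loop, has to be added to this private syntax by hand.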
Once the first projects were deployed, dissatisfaction with this template engine set in. You constantly have to add functions, you have to teach developers your own language, and there are no proper debugging or development tools... In short, nonsense. I am always amazed when I still come across products that use their own template engines. It is like picking your nose with your toe: uncomfortable, and you never reach the goal. We could have used someone else's template engine, but that would have been trading one awkward tool for another, so we rejected that option and went looking for an industry-standard solution.
And indeed, many books and textbooks taught programmers and designers that the best way to build a template engine, or to separate data from its appearance, is to put everything into XML, pass it through XSLT and get HTML out. That is, XSLT would be the best template engine you could have.
It was argued that projects with XSLT templates would be simpler and more convenient than projects with any other kind of template, and easier to manage. Much was said about the safety, portability and ease of development of such projects.
Everyone took this at face value and started building products that way. And of course we too listened to enough of it to believe that our future lay with XML / XSLT technologies.
We pulled off a feat: we made XSLT templates work fast enough, and put a great deal of effort, time and money into developing the technology... The largest product catalogs contained 70 thousand products.
But the clients voted with their rubles and made everything clear.
What we got as a result:
1. The illusion of the simplicity of XSLT templates.
Projects built on XML / XSLT proved very difficult for our customers and partners to maintain. The cost of ownership of such projects is very high and tends to grow with the project. There are very few XSLT specialists, and a template of any technical complexity can be fixed only by a fairly highly qualified one. Selecting data with XPath is not particularly convenient either. Thus the illusion of simplicity for clients and of convenient project management was dispelled.
2. The illusion of controllability and flexibility.
XSLT templates are, for the most part, not up to expressing serious business logic in the public part of a site. Why would serious business logic end up in an XSLT template at all? Because in some applications the data is the same, but what the user sees, and in what form, depends on who the user is or on a number of conditions. That is the template's job, and it needs logic. XSLT did not develop into a full-fledged programming language; it handles only simple conditional output and limited logic. There is no way to draw on the full power of modern programming languages and their libraries (graphics, presentation, utility functions and so on).
3. The illusion of performance, low-cost hosting and scalability.
However hard developers try, the performance of XML / XSLT systems remains very low, despite all the efforts of the industry. And where would the performance come from? First, the data from the SQL database is converted to XML, which, because of its structure, is a large piece of text. Then that XML is loaded into an XML parser on the server, where it takes up even more memory so that XPath can work, indexes over the XML data can be built at load time, and so on. Then XSLT walks this huge data set and produces output that is, once again, text occupying memory.
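The chain described above can be sketched roughly as follows. This is an illustration under stated assumptions, not the product's original ASP / MSXML code: it uses Python's standard library, an in-memory SQLite table stands in for MSSQL, and a plain tree walk stands in for the XSLT step, since the standard library has no XSLT processor. The table layout and the function names (`rows_to_xml`, `render_html`, `build_page`) are invented for the example.

```python
import html
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(rows):
    """Step 1: serialize SQL rows into one XML text document."""
    root = ET.Element("catalog")
    for product_id, name, price in rows:
        item = ET.SubElement(root, "product", id=str(product_id))
        ET.SubElement(item, "name").text = name
        ET.SubElement(item, "price").text = f"{price:.2f}"
    return ET.tostring(root, encoding="unicode")


def render_html(xml_text):
    """Steps 2 and 3: parse the XML text into a tree, then walk it to emit HTML.

    The tree walk is only a stand-in for an XSLT transformation.
    """
    tree = ET.fromstring(xml_text)
    parts = ["<ul>"]
    for item in tree.findall("product"):
        name = item.findtext("name")
        price = item.findtext("price")
        parts.append(f"<li>{html.escape(name)}: {price}</li>")
    parts.append("</ul>")
    return "".join(parts)


def build_page(product_count):
    """Run the whole chain and return every intermediate representation."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE products (id INTEGER PRIMARY KEY, name TEXT, price REAL)"
    )
    conn.executemany(
        "INSERT INTO products (name, price) VALUES (?, ?)",
        [(f"Product {i}", i * 1.5) for i in range(product_count)],
    )
    rows = conn.execute("SELECT id, name, price FROM products ORDER BY id").fetchall()
    conn.close()

    xml_text = rows_to_xml(rows)       # copy 2: the data as XML text
    html_text = render_html(xml_text)  # copy 3 (parsed tree) and copy 4 (output)
    return rows, xml_text, html_text


rows, xml_text, html_text = build_page(3)
```

Each stage holds its own full copy of the data: the rows, the XML text, the parsed tree and the output text.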
The only real performance solution we see is multi-level caching, which is sometimes impossible, sometimes undesirable, and sometimes simply expensive, both to develop and to operate.
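A minimal sketch of what two-level caching might look like, assuming one cache for the generated XML and another for the rendered HTML. The class and its callbacks (`load_xml`, `render`) are hypothetical names for illustration; a real system would also need expiry, size limits and shared storage.

```python
class TwoLevelCache:
    """Level 1 caches generated XML per page; level 2 caches rendered HTML
    per (page, user group) pair."""

    def __init__(self, load_xml, render):
        self._load_xml = load_xml  # callable: page_key -> XML text
        self._render = render      # callable: (XML text, user_group) -> HTML text
        self._xml_cache = {}
        self._html_cache = {}
        self.xml_builds = 0        # counters, so the effect of caching is visible
        self.renders = 0

    def get_page(self, page_key, user_group):
        html_key = (page_key, user_group)
        if html_key in self._html_cache:
            return self._html_cache[html_key]

        if page_key not in self._xml_cache:
            self._xml_cache[page_key] = self._load_xml(page_key)
            self.xml_builds += 1

        page = self._render(self._xml_cache[page_key], user_group)
        self.renders += 1
        self._html_cache[html_key] = page
        return page

    def invalidate(self, page_key):
        """Drop the XML and every rendered variant of one page."""
        self._xml_cache.pop(page_key, None)
        stale = [key for key in self._html_cache if key[0] == page_key]
        for key in stale:
            del self._html_cache[key]


cache = TwoLevelCache(
    load_xml=lambda key: f"<page>{key}</page>",
    render=lambda xml, group: f"{group}:{xml}",
)
first = cache.get_page("home", "guest")
```

Because the rendered output depends on who is looking at the page, the second-level cache needs an entry per page and user group, and any data change has to invalidate all of them. That is where the development and operating cost comes from.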
Sometimes people object: why do you need so much data in XML? Well, to build a site template, or even just a forum, in XSLT, almost all the data has to be in the XML: user authorization, statistics, the catalog, branches of products or articles... A partial solution is to request the necessary data into XML directly from the XSLT. But you still end up extracting significantly more data, writing more conditions and spending more memory and queries, while the template itself grows more complex and effectively turns into a full-fledged software application, which defeats the original goal of keeping the template simple.
As a result, projects built on this technology are very hard to place on ordinary hosting services; they almost always need dedicated machines. The need to scale such projects arises rather quickly and requires significant financial outlay.
4. The illusion of the convenience of abstracting data and appearance.
One of the benefits everyone chases with XML / XSLT is a clean abstraction of appearance from data. But once the first templates are written, everyone realizes that the abstraction has been achieved, only nobody needs it anymore. An XSLT template is a very complex way to express appearance, especially that of a modern AJAX application. Correcting such a template takes a lot of effort. A complete redesign requires rewriting all the templates, which, given how laborious XSLT is to write, is even more expensive. So the cost of ownership of XML / XSLT technology turns out to be very high for both developers and their customers.
5. Inability to create libraries and reuse work results.
In our experience, finished templates or an XSLT application cannot readily be gathered into a library that would simplify a typical operation in the next template. Programmers are severely limited in handing their work over to colleagues. An error, once found, has to be tracked down in several templates.
6. Non-linear growth of development complexity as the application evolves.
It is strange but true: when an application has to be developed dynamically and scaled, the complexity of developing it with XSLT grows completely out of proportion to the functionality gained. Applications are hard to debug. All this also means bringing in more, and more highly qualified, programmers.
And what comes of the heroic efforts of developers to master the technology?
High costs of ownership and development, ever-growing scaling costs, low demand for the products, expensive services, a shift into a higher price bracket, fewer orders, a slower pace of product development, the lack of a working partner network...
Does this mean that XML should be abandoned completely? Of course not. XML / XSLT is a very beautiful technology (a very programmer thing to say, isn't it?) and will continue to evolve. XML works well for exchanging data between projects (RSS, Yandex.Market, CommerceML and others). For small templates and fragments, XSLT is still used quite effectively today to process XML.
.NET fully supports XML in all its forms. XSLT transformation is supported too: use it if you need it, leave it alone if you do not. Nobody insists on XML / XSLT.
There are even very prominent examples of major projects using XML / XSLT. As far as I know, Yandex uses this combination for its services. For me, the best measure of efficiency is the financial result. I will not argue the point, but I would guess that for Yandex the cost of development relative to development speed is quite high: evidently within acceptable limits for the project, but still not as comfortable as one would like. This is only my guess. I am not drawing an analogy between us and Yandex; that would be inappropriate. We made a mass-market product, whereas they build internal services for themselves, can use every optimization and tuning tool available for the application, and maintain it entirely with their own team.
Of course, it is always tempting to say that the MSXML we chose was bad and that there are newer and better parsers and more suitable platforms. Maybe. But that is not the point; the point is the economics of the project and the reasons I listed above. In the end we concluded that using XML / XSLT in a mass-market product was unreasonably expensive and impractical.
What is this whole article for? Choose your tools deliberately, and do not be guided only by fashion or by technology vendors' assurances of incredible effectiveness. If you are making a mass-market product, as we do, consider the economics of your partners and customers. If using the product requires a developer to know the platform's programming language (ASP, Java, PHP...) and, on top of that, to use XSLT in templates and XPath to find data, that alone can make the project unprofitable to use, not to mention a host of other reasons.
In short, nobody has cancelled using your own head.