How to design software to avoid problems: web form data processing
Answering this question always means asking how the software will evolve: which parts of the system are most likely to change, and which are likely to stay constant?
We will explore this question through examples, looking at several tasks that come up regularly in more or less large systems. Why tasks of this kind arise is beyond the scope of this post; here we focus on the implementation.
Web form data processing
Forms tend to resemble one another: one field is added, another is removed, the allowed values are restricted to numbers, a single choice becomes a multiple choice, and so on. From the presentation point of view these are minor changes. But how does the internal logic change?
The first thing that comes to mind when looking at simple forms is to map an object onto the form, giving each form field a corresponding field in the object. When a request arrives, we copy all the form fields into the corresponding object fields and validate them. If there are errors, we show them to the user next to the relevant fields and let the user edit the form again. If everything is fine, we pass the data on for further processing (how exactly depends on the task). If processing itself produces errors, we render those as well (possibly together with the form).
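The request cycle described above can be sketched roughly as follows. This is only an illustration, not a prescribed implementation: `SignupForm`, its fields, the validators and the `handle` function are all hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass
class SignupForm:
    """Hypothetical object whose attributes mirror the form fields one to one."""
    name: str = ""
    email: str = ""
    age: Optional[str] = None  # raw submitted text; this field is optional


# One validator per form field; each returns an error message or None.
VALIDATORS: Dict[str, Callable[[Optional[str]], Optional[str]]] = {
    "name": lambda v: None if v and v.strip() else "Name is required.",
    "email": lambda v: None if v and "@" in v else "Enter a valid email address.",
    "age": lambda v: None if v in (None, "") or v.isdigit() else "Age must be a number.",
}


def bind(request_params: Dict[str, str]) -> SignupForm:
    """Copy every known form field from the request into the object."""
    form = SignupForm()
    for name in VALIDATORS:
        if name in request_params:
            setattr(form, name, request_params[name])
    return form


def validate(form: SignupForm) -> Dict[str, str]:
    """Return errors keyed by field name so they can be shown next to each field."""
    errors = {}
    for name, check in VALIDATORS.items():
        message = check(getattr(form, name))
        if message is not None:
            errors[name] = message
    return errors


def handle(request_params: Dict[str, str]) -> Tuple[str, SignupForm, Dict[str, str]]:
    """Decide whether to re-render the form or pass the data on."""
    form = bind(request_params)
    errors = validate(form)
    if errors:
        return "edit", form, errors  # re-render the form with field-bound errors
    return "process", form, {}       # hand over for task-specific processing
```

Note how tightly this sketch couples the object to one particular set of fields: every change to the form touches the class, the validators, and whatever consumes the object.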
Let's develop this idea.
Consider how forms evolve. As a rule, the changes tend towards the following (in any combination):
- Adding new fields
- Removing optional fields
- Change of context
- Changing whether a field is required (in either direction)
- Removing required fields (= making the field optional + removing an optional field)
- Changing the possible values of a field, including the type of the values
This leads to the following problems:
- A new field is added. The sets of fields in forms that used to be identical can diverge completely over time.
- An optional field is removed. Simply remove the handlers for this parameter. It would be nice to clean up the database table and the code that works with the field as well, but it is not critical.
- A required field becomes optional. Null checks scattered through the code make life miserable.
- An optional field becomes required. The problem is the existing data: how do you process a record that has no value when the operation requires one?
- The possible values of a field change, including the type of the values. The problem here is obvious: everything that depends on the field's type has to change.
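To make the two required/optional cases above more concrete, here is a minimal sketch that keeps both the submission check and the handling of old records for one field in a single place. The field (`phone`), the class `PhoneField` and the exception `MissingRequiredValue` are hypothetical and serve only as an illustration.

```python
from typing import Optional


class MissingRequiredValue(Exception):
    """Raised when an operation needs a value that an old record never stored."""


class PhoneField:
    """Hypothetical single home for the logic around one formerly optional field."""

    def __init__(self, required: bool):
        # Flipping this flag is the only change needed when the rule changes.
        self.required = required

    def validate_submission(self, raw: Optional[str]) -> Optional[str]:
        """Check a newly submitted value; return an error message or None."""
        if raw is None or raw.strip() == "":
            return "Phone is required." if self.required else None
        digits = raw.replace("+", "").replace(" ", "").replace("-", "")
        return None if digits.isdigit() else "Phone must contain digits only."

    def value_for_operation(self, stored: Optional[str]) -> str:
        """Read a stored value for an operation that cannot work without it."""
        if stored is None:
            # Record saved while the field was still optional: fail explicitly
            # here instead of scattering null checks across the callers.
            raise MissingRequiredValue("phone")
        return stored
```

Whether a missing value should raise an error, trigger a request to the user, or fall back to a default depends on the task; the sketch only shows that the decision lives in one place.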
What are the conclusions?
- NOT NULL checks in the database are needed only for fields that are guaranteed to be required, that is, fields without which the database record loses its meaning. As a rule there are few such fields, and they often form an alternate key (or part of one). If you doubt this, try to imagine introducing anonymous users (i.e. users without registration).
- If the set of fields in a semantically similar form varies widely with context (for example, a questionnaire format that differs per customer), it makes sense to introduce attribute logic, where one database record corresponds to one field value rather than to the whole set of form fields.
- If there are many optional fields that exist only to store data, attribute logic also makes sense: in practice their set keeps changing.
- If different parts of the system attach different logic to the same optional field, it makes sense to consolidate that logic in one place. Let it start as a class with two or three methods; their number can easily grow to 15-20 later. This also helps with the potential problem of the field becoming required.
- To avoid typing problems, store the type of each field, and always pass a field's value together with a reference to the meta-information about the field and its type. If data has to be exchanged with external systems, introduce clear rules for serialization (for example to a string, including inside XML) and for handling parsing errors.
- Regular refactoring includes removing old code that works with fields that no longer exist. Do not leave commented-out code blocks behind: the history belongs in the version control system.
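The conclusions about attribute logic and field types can be sketched together as follows. This is one possible shape under stated assumptions, not the only way to do it: the names `FieldMeta`, `AttributeRow`, `FIELDS`, `to_rows` and `from_rows` are hypothetical, and a real system would keep the rows in a database table rather than in a Python list.

```python
from dataclasses import dataclass
from datetime import date
from typing import Any, Callable, Dict, List


@dataclass(frozen=True)
class FieldMeta:
    """Hypothetical meta-information about a field: its name and serialization rules."""
    name: str
    to_text: Callable[[Any], str]
    from_text: Callable[[str], Any]


class FieldParseError(ValueError):
    """Raised when stored or received text does not match the field's declared type."""


# Registry of field definitions (illustrative names only).
FIELDS: Dict[str, FieldMeta] = {
    "age": FieldMeta("age", str, int),
    "birthday": FieldMeta("birthday", date.isoformat, date.fromisoformat),
    "nickname": FieldMeta("nickname", str, str),
}


@dataclass(frozen=True)
class AttributeRow:
    """One record per field value, instead of one wide record per form."""
    entity_id: int
    field_name: str
    text_value: str


def to_rows(entity_id: int, values: Dict[str, Any]) -> List[AttributeRow]:
    """Serialize each value using the rules attached to its field."""
    return [
        AttributeRow(entity_id, name, FIELDS[name].to_text(value))
        for name, value in values.items()
    ]


def from_rows(rows: List[AttributeRow]) -> Dict[str, Any]:
    """Parse rows back into typed values, reporting which field failed."""
    result = {}
    for row in rows:
        meta = FIELDS[row.field_name]
        try:
            result[row.field_name] = meta.from_text(row.text_value)
        except ValueError as exc:
            raise FieldParseError(f"{meta.name}: {exc}") from exc
    return result
```

With this layout, adding or removing an optional field means changing the registry, not the table schema, and a change of a field's type is confined to its `FieldMeta` entry.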
P.S. In upcoming posts we will look at report generation, parsing "raw" network protocol data, and other problems, at the request of the habrasociety.