F# spoiled me, or why I don't want to write in C# anymore
I used to love C# very much.
It was my main programming language, and every time I compared it with others, I was glad I had chosen it back then. Python and JavaScript lose right away because of dynamic typing (if the concept of typing even applies to JavaScript); Java falls short because of its generics, the lack of events and value types, and the resulting carousel of primitives and objects split into two camps with mirror wrapper classes like Integer, and so on. In a word: C# is cool.
Separately, I note that I am talking here about the language itself and how convenient it is to write code in it.
I am not taking into account tooling, the abundance of libraries, or the size of the community, because each
of these languages is developed enough for industrial development to be comfortable in most cases.
And then, out of curiosity, I tried F#.
And what's in it?
I will be brief, in order of importance to me:
- Immutable types
- The functional paradigm, which turned out to be much stricter and more harmonious than what we call OOP today
- Sum types, also known as Discriminated Unions or tagged unions
- Concise syntax
- Computation expressions
- SRTP (Statically Resolved Type Parameters)
- By default even reference types cannot be assigned null, and the compiler requires initialization at declaration
- Type inference
With null everything is clear: nothing clutters project code more than endless null checks on return values like Task<IEnumerable<Employee>>. So first, let's discuss immutability and conciseness together.
Suppose we have the following POCO class:
public class Employee
{
    public Guid Id { get; set; }
    public string Name { get; set; }
    public string Email { get; set; }
    public string Phone { get; set; }
    public bool HasAccessToSomething { get; set; }
    public bool HasAccessToSomethingElse { get; set; }
}
Simple, compact, nothing superfluous. Where could it possibly get more concise?
The corresponding F # code looks like this:
type Employee =
    { Id: Guid
      Name: string
      Email: string
      Phone: string
      HasAccessToSomething: bool
      HasAccessToSomethingElse: bool }
Now there really is nothing superfluous. The useful information is the keyword of the type declaration, the name of the type, the field names and their types. In the C# example, every line carries an unnecessary public and { get; set; }. In addition, in F# we got immutability and protection against null.
Well, let's suppose we can also arrange immutability in C#, and public is quick to type with autocompletion:
public class Employee
{
    public Guid Id { get; }
    public string Name { get; }
    public string Email { get; }
    public string Phone { get; }
    public bool HasAccessToSomething { get; }
    public bool HasAccessToSomethingElse { get; }

    public Employee(Guid id, string name, string email, string phone, bool hasAccessToSmth, bool hasAccessToSmthElse)
    {
        Id = id;
        Name = name;
        Email = email;
        Phone = phone;
        HasAccessToSomething = hasAccessToSmth;
        HasAccessToSomethingElse = hasAccessToSmthElse;
    }
}
Done! True, the amount of code has grown threefold: we duplicated every field twice.
Moreover, when a new field is added, we can forget to add it to the constructor parameters and/or forget to assign it inside the constructor, and the compiler will not tell us anything.
In F#, to add a field you add the field. That's it.
Initialization looks like this:
let employee =
    { Id = Guid.NewGuid()
      Name = "Peter"
      Email = "peter@gmail.com"
      Phone = "8(800)555-35-35"
      HasAccessToSomething = true
      HasAccessToSomethingElse = false }
And if you forget one field, the code will not compile. Since the type is immutable, the only way to make a change is to create a new instance. But what if we want to change only one field? It's simple:
let employee2 = { employee with Name = "Valera" }
How do you do this in C#? Well, you know without me.
Add a nested reference-type field, and your { get; } no longer guarantees anything: the fields of that field can still be changed. Should I even mention collections?
But do we really need this immutability?
I did not add the two boolean access fields by accident. In real projects some service is responsible for access, and it often takes a model as input and mutates it, setting true where needed. And then, somewhere further along in the program, I get a model in which these boolean properties are false. What does that mean? The user has no access, or the model simply has not been run through the access service yet? Or maybe it has, but someone forgot to initialize some fields? I don't know; I have to go check and read a pile of code.
When the structure is immutable, I know the values are the actual ones, because the compiler obliges me to fully initialize the object at its declaration.
Otherwise, when adding a new field, I must:
- Check all the places where this object is created; perhaps the field needs to be filled there too.
- Check the relevant services that mutate this object.
- Write / update unit tests affecting this field.
- Update mappings.
Also, I do not have to fear that my object gets mutated inside someone else's code or on another thread. But in C# real immutability is so hard to achieve that writing such code is simply not worth it; immutability at that price does not save development time.
Well, enough about immutability. What else do we have? In F#, we also got for free:
- Structural equality
- Structural comparison
Now we can write constructions like this:
if employee1 = employee2 then // ...
And it really checks the equality of the objects. An Equals that checks reference equality is useless: we already have Object.ReferenceEquals for that, thanks.
Someone may say nobody needs this because we don't compare objects in real projects, so Equals and GetHashCode are needed so rarely that we can override them by hand. But I think the causal link here runs in the opposite direction: we don't compare objects because overriding everything and maintaining it is too expensive. When it comes for free, it gets used immediately: you can use your models directly as keys in dictionaries, add models to a HashSet<> or SortedSet<>, and compare objects not by some Id (though that option is, of course, still available) but simply compare them.
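As a small illustration (the Point record here is my own hypothetical example, not from the article), structural equality and hashing come for free, so records can serve directly as keys:
open System.Collections.Generic

// A record gets structural Equals, GetHashCode and comparison for free.
type Point = { X: int; Y: int }

let a = { X = 1; Y = 2 }
let b = { X = 1; Y = 2 }

printfn "%b" (a = b)              // true: fields are compared, not references

// So records work directly as keys in sets and dictionaries.
let visited = HashSet<Point>()
visited.Add a |> ignore
printfn "%b" (visited.Contains b) // true: b is structurally equal to a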
Discriminated Unions
I think most of us absorbed, with our first team lead's milk, the rule that building logic on exceptions is bad. For example, instead of try { i = Convert.ToInt32("4"); } catch() {...} the correct choice is int.TryParse.
But beyond this primitive, done-to-death example, we violate this rule constantly. The user entered invalid data? ValidationException. Index out of the bounds of the array? IndexOutOfRangeException!
Smart books tell us that exceptions are for exceptional situations: unpredictable ones, when something has gone completely wrong and there is no point in trying to continue. Good examples are OutOfMemoryException, StackOverflowException, AccessViolationException and so on. But is going outside the bounds of an array unpredictable? Seriously? The indexer accepts an Int32, which has 2^32 possible values. In most cases we work with arrays whose length does not exceed 10,000; in rare cases, a million. That is, for Int32 there are far more values that cause an exception than values that work correctly, so a randomly chosen index is statistically more likely to land in an "exceptional" situation!
The same with validation: the user entered invalid data. What a surprise.
The reason we abuse exceptions so actively is simple: the type system is not expressive enough to adequately describe the scenario "if everything is fine, return the result; if not, return an error." Strong typing obliges us to return the same type from every branch of a method (fortunately), but nobody was going to add string ErrorMessage and bool IsSuccess to every type. So in C# reality, exceptions are perhaps the lesser of the evils in this situation.
Again, you can write a class:
public class Result<TResult, TError>
{
    public bool IsOk { get; set; }
    public TResult Value { get; set; }
    public TError Error { get; set; }
}
But here we again have to write a bunch of code if we want, for example, to make an invalid state unrepresentable. In the primitive implementation above you can assign both the result and the error, or forget to initialize IsOk, so it causes more problems than it solves.
In F#, things like this are done more simply:
type Result<'TResult, 'TError> =
    | Ok of 'TResult
    | Error of 'TError

type ValidationResult<'TInput> =
    | Valid of 'TInput
    | Invalid of string list

let validateAndExecute input =
    match validate input with            // check the result of the validation function
    | Valid input -> Ok (execute input)  // if valid, return Ok with the result
    | Invalid messages -> Error messages // if not, return Error with the list of messages
No exceptions, everything is concise, and most importantly, the code documents itself. You don't need to write in the XML doc that the method throws some exception, and you don't need to frantically wrap someone else's method call in try/catch just in case. In such a type system an exception really is an unpredictable, wrong situation.
When you throw exceptions left and right, you need non-trivial error handling. Here is your BusinessException or ApiException; now you have to spawn exceptions inherited from them, make sure they are used everywhere, and if you mix something up, the client receives a 500 instead of, say, a 404 or 403. Add to that tedious log analysis, reading stack traces, and so on.
The F# compiler emits a warning if our match has not covered all the possible cases, which is very handy when you add a new case to a DU. With a DU we define a workflow, for example:
type UserCreationResult =
    | UserCreated of id: Guid
    | InvalidChars of errorMessage: string
    | AgreeToTermsRequired
    | EmailRequired
    | AlreadyExists
Here we immediately see every possible scenario of this operation, which is much clearer than a general list of exceptions. And when we added the new case AgreeToTermsRequired to meet new requirements, the compiler raised a warning at every place where we process this result.
I have never seen a project use such a visual, descriptive set of exceptions (for obvious reasons). As a result, the scenarios end up described in the text messages of those exceptions. In such an implementation duplicates appear, and, conversely, developers get lazy about adding new messages and instead make the existing ones more general.
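To illustrate the exhaustiveness point, here is a hypothetical handler for the UserCreationResult above (the mapping to HTTP status codes is my own, purely illustrative):
// If a new case is added to UserCreationResult and this match is not updated,
// the compiler emits an incomplete-match warning pointing right here.
let toHttpStatus result =
    match result with
    | UserCreated _ -> 201
    | InvalidChars _ -> 400
    | AgreeToTermsRequired -> 400
    | EmailRequired -> 400
    | AlreadyExists -> 409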
Array indexing is now also very concise, with no if/else length checks either:
open System

let doSmth myArray index =
    match Array.tryItem index myArray with
    | Some elem -> Console.WriteLine(elem)
    | None -> ()
Option is a type from the standard library:
type Option<'T> =
    | Some of 'T
    | None
Every time you use it, the code itself tells you that the absence of a value here is possible by design, not because of a programmer's mistake. And the compiler will emit a warning if you forget to handle all the possible cases.
The rigor of the paradigm
Pure functions and expression-based language design let us write very stable code.
A pure function meets the following criteria:
- Its only result is the computed value; it changes nothing in the outside world.
- It always returns the same value for the same arguments.
Add totality to this (the function can correctly compute a value for every possible input) and you get thread-safe code that always works correctly and is easy to test.
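As a minimal illustration (toy examples of my own, not from the article):
// Pure: depends only on its arguments and changes nothing in the outside world.
let add a b = a + b

// Impure: the result depends on the current time, and it writes to the console.
let logNow message =
    printfn "%O %s" System.DateTime.Now message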
Expression-based design means that everything is an expression, everything has a value. For example:
let a = if someCondition then 1 else 2
The compiler makes us account for all possible combinations; we cannot just stop at the if and forget about the else.
In C#, this usually looks like this:
int a = 0;
if (someCondition)
{
    a = 1;
}
else
{
    a = 2;
}
Here it is easy to lose one of the branches later, and a will keep its default value: another place where the human factor can strike.
Of course, you will not get far on pure functions alone: we need I/O at the very least. But these impure effects can be strictly confined to user input and to working with data stores. Business logic can be built from pure functions, and then it will be more reliable than a Swiss watch.
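A minimal sketch of this idea, with names of my own invention: the impure edges read input and persist the result, while the business rule in the middle stays pure and trivially testable.
// Pure core: a business rule, easy to unit test in isolation.
let applyDiscount (total: decimal) =
    if total > 100m then total * 0.9m else total

// Impure shell: I/O is passed in as functions and kept at the edges.
let handleOrder (readTotal: unit -> decimal) (saveTotal: decimal -> unit) =
    let total = readTotal ()             // impure: input
    let discounted = applyDiscount total // pure: business logic
    saveTotal discounted                 // impure: persistence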
Avoiding the usual OOP
Standard case: you have a service that depends on a couple of other services and on a repository. Those services, in turn, may depend on other services and their own repositories. A mighty DI framework rolls all this into a tight sausage of functionality that is handed to a web API controller when a request comes in.
Each dependency of our service (of which there are, say, 2 to 5 on average), like our service itself, usually has 3 to 5 methods, most of which are of course completely unnecessary in any particular scenario. From this whole sprawling tree of methods we usually need 1 or 2 from each dependency in any given scenario, yet we wire together whole blocks of functionality and create a pile of objects. And mocks, of course. How else: we need to test all this beauty somehow. So I want to cover a method with tests, but to call that method I need an instance of the service, and to create it I have to stuff it with mocks. The catch is figuring out which mocks are never even called in my method, so I don't need them. Some are called, but only a couple of their methods. So in every test I do a tedious setup of these mocks with return values and other paraphernalia. Then I want to test the second scenario in the same method. A new setup awaits. Sometimes there is more code in the tests for a method than in the method itself. And yes, for every method I have to crawl into its guts and see which dependencies I actually need this time.
This shows up not only in tests: when I want to use a single method of some service, I have to satisfy all of its dependencies just to create the service, even if half of them are not used in my method. Yes, the DI framework takes care of it, but all those dependencies still have to be registered in it. This can often be a problem, for example when some of the dependencies live in a different assembly and now we have to add a reference to it. In some cases this can badly spoil the architecture, and then you have to contort yourself with inheritance or extract the common block into a separate service, increasing the number of components in the system. The problems are solvable, of course, but unpleasant.
In the functional paradigm it works a bit differently. The coolest kid here is the pure function, not the object. And, as you have already gathered, immutable values are used predominantly, not mutable variables. Besides, functions compose beautifully, so in most cases we do not need service objects at all. Does the repository fetch what you need from the database? Fine, fetch it and pass the value itself into the service, not the repository!
A simple scenario looks like this:
let getReport queryData =
    use connection = getConnection()
    queryData
    |> DataRepository.get connection // the connection dependency is injected into the function, not into a constructor,
                                     // so we no longer have to track dependency lifestyles in a huge tree
    |> Report.build
For those not familiar with the |> operator and currying, this is equivalent to the following code:
let getReport queryData =
    use connection = getConnection()
    Report.build (DataRepository.get connection queryData)
In C#:
public ReportModel GetReport(QueryData queryData)
{
    using (var connection = GetConnection())
    {
        // Report here is a static class: F# modules compile into static classes
        return Report.Build(DataRepository.Get(connection, queryData));
    }
}
And since functions compose so well, you can write something like this:
let getReport queryData =
    use connection = getConnection()
    queryData
    |> (DataRepository.get connection >> Report.build)
Note that Report.build could not be easier to test. You don't need mocks at all. Moreover, there is a framework, FsCheck, that generates hundreds of input parameters, runs your method on them, and shows you the data on which the method breaks. The benefit of such tests is incomparably greater; they genuinely test your system for robustness, while unit tests tickle it rather half-heartedly.
All you need to do to run such tests is write a generator for your type once. How is that better than writing mocks? The generator is universal, it works for all future tests, and you don't need to know anyone's implementation in order to write it.
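A minimal FsCheck sketch (the property itself is an illustration of mine, not from the article):
open FsCheck

// FsCheck generates hundreds of random lists and prints a counterexample
// if the property ever fails.
let reverseTwiceIsIdentity (xs: int list) =
    List.rev (List.rev xs) = xs

Check.Quick reverseTwiceIsIdentity // "Ok, passed 100 tests."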
By the way, the dependency on the assembly with the repositories (or their interfaces) is no longer needed. All assemblies operate on common types and depend only on those, not on each other. If you suddenly decide to swap, say, EntityFramework for Dapper, the assembly with the business logic will not be affected at all.
Statically Resolved Type Parameters (SRTP)
It is better to show than to tell.
let inline square (x: ^a when ^a: (static member ( * ): ^a * ^a -> ^a)) = x * x
This function works for any type that defines a multiplication operator with the appropriate signature. Of course, this also works with ordinary static methods, not only with operators. And not only with static ones!
let inline GetBodyAsync x = (^a: (member GetBodyAsync: unit -> ^b) x)

open System.Threading.Tasks

type A() =
    member this.GetBodyAsync() = Task.FromResult 1

type B() =
    member this.GetBodyAsync() = async { return 2 }

A() |> GetBodyAsync |> fun x -> x.Result          // 1
B() |> GetBodyAsync |> Async.RunSynchronously     // 2
We do not need to define an interface, write wrappers around someone else's classes, or implement that interface; the only requirement is that the type has a method with a suitable signature. I don't know how to do this in C#.
Computation expressions
We looked at the Result type above. Suppose we want to perform a cascade of operations, each of which returns such a Result, and if even one link in this chain returns an error, we want to stop execution and return the error immediately.
Instead of endlessly writing:
let res arg =
    match doJob arg with
    | Error e -> Error e
    | Ok r ->
        match doJob2 r with
        | Error e -> Error e
        | Ok r -> ...
We can write once
type ResultBuilder() =
    member __.Bind(x, f) =
        match x with
        | Error e -> Error e
        | Ok x -> f x
    member __.Return x = Ok x
    member __.ReturnFrom x = x

let result = ResultBuilder()
And use it like this:
let res arg =
    result {
        let! r = doJob arg
        let! r2 = doJob2 r
        let! r3 = doJob3 r2
        return r3
    }
Now on every line with let!, in the case of Error e, we return the error. If everything goes fine, at the end we return Ok r3.
And you can build such things for anything, even with custom operations under custom names. Plenty of room for building DSLs.
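For example, here is a minimal sketch of the same idea for Option, written in the same style as the ResultBuilder above (the builder and the lookup example are illustrative, not from the article):
// Each let! short-circuits the whole computation on None.
type OptionBuilder() =
    member __.Bind(x, f) = Option.bind f x
    member __.Return x = Some x
    member __.ReturnFrom (x: 'a option) = x

let option = OptionBuilder()

// Chains lookups that may each be missing; any None stops the computation.
let tryTotal (prices: Map<string, decimal>) =
    option {
        let! apples = prices.TryFind "apples"
        let! bread = prices.TryFind "bread"
        return apples + bread
    }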
By the way, there is such a thing for asynchronous programming too, two of them in fact: task and async. The first works with the familiar Tasks, the second with F#'s own Async, which differs from Task by being cold-started; it also has integration with the Task API. You can build complex workflows with cascading and parallel execution and start them only when they are ready. It looks like this:
let myTask =
    task {
        let! result = doSmthAsync()            // essentially the same as await on a Task
        let! result2 = doSmthElseAsync(result)
        return result2
    }
let myAsync =
    async {
        let! result = doAsync()
        let! result2 = do2Async(result)
        do! do3Async(result2)
        return result2
    }
let result2 = myAsync |> Async.RunSynchronously
let result2Task = myAsync |> Async.StartAsTask
let result2FromTask = myTask |> Async.AwaitTask
File structure in the project
Since records (DTOs, models, etc.) are declared concisely and contain no logic, the number of files in a project shrinks considerably. Domain types can be described in a single file, and types specific to a narrow block or layer can be defined together in another.
By the way, in F# the order of lines of code and of files matters: by default, on the current line you can only use what has already been defined above. This is by design, and it is great, because it protects you from cyclic dependencies. It also helps during code review: the order of files in the project exposes design errors. If a high-level component is defined at the very top, someone has messed up the dependencies, and that is visible at a glance. Now imagine how long it would take you to find such things in C#.
For comparison, I described all the logic and domain types of a Snake game in 7 files, all but one of them under 130 lines of code.
Summing up
Once you have these powerful tools and get used to them, you start solving problems faster and more elegantly. Most of the code, once written and once tested, just keeps working. Going back to writing C# would mean giving them up and losing productivity. It feels like stepping back into the last century: I used to run in comfortable sneakers, and now I am in sandals. Better than nothing, but worse than something. Yes, features are slowly being added to C#: pattern matching, records are on their way, even nullable reference types.
But all of this, first, arrives much later than in F#, and second, in a poorer form. Pattern matching without discriminated unions and record destructuring is, well, better than nothing. Nullable reference types are not bad, but Option is better.
I would say the main problem with F# is that it is hard to "sell" it to the decision makers.
But if you do decide to learn F#, getting into it will be easy.
And tests become pleasant to write, and they actually bring real benefit. Property-based tests (the kind I described in the FsCheck example) have several times shown me design errors that QA would have hunted for a very long time. Unit tests mostly showed me that I had forgotten to update something in a test configuration; and yes, from time to time they showed that I had missed something somewhere in the code. In F#, the compiler handles that. For free.