TASK_RTF_NOTES in MS Project or RTF in MS SQL. How to defeat him and prepare cubes in SSAS

There is a rather painful task for programmers of user organizations of MS Project - getting notes of responsible persons. Notes are of considerable value (with the correct formulation of the control problem), because without the initial information the problems cannot be classified and the right decision cannot be made. They, notes, of course, should be displayed in reports.

From the point of view of the user, everything seems to be simple - the report is the report, but from a technical point of view, a lot of nuances and questions are opened. In this article, I present my decision based on some pieces of code scattered here and there, over the network, and I hope that it will be useful to my colleagues.

I do not pretend to be original, but for some reason I did not find the analogues of this solution, I had to assemble it myself. In addition, I’m not a very deep specialist in MS SQL, therefore, if there are sensible comments, please comment.


Notes are obviously stored in a field called TASK_RTF_NOTES. The field is filled in by the user when transmitting reports on tasks for which he is responsible. There seems to be no special need for markup tools, moreover, they, these tools, are hidden from the user, but my Indians, respected (though not understood), for some reason save this information in RTF format. Everything would be fine, but it is necessary to prepare reports, including not on one project, but on a heap at a time. A good tool for preparing reports is SSAS in conjunction, for example, with MS Excel.

Yes, in MS Project Professional, of course, there is a "report builder." But these are not cubes, this is a bunch of requests with output to the tablet. And to change the cuts on the go, even if you really want to, you will not succeed. And yet you will have to send this report to a bunch of people instead of having them take it at any given time.

That is, presumably, the scheme is something like this:
1) Extract the value of the field
2) Transfer the value to the cube dimension
3) Get a beautiful report in MS Excel (for example)

That is something like this:


But there it was.

The first surprise is that the field is of type image . What for? Unclear. Why I stopped on this trifle - it will be clear further. Converted to varbinary (MAX) , drove on.
The second surprise is, in fact, RTF. what to do with it? We climb into the network and find a bunch of tips that I have divided into 2 categories:

1) Use the RTFTextBox object.
The tip is right on MSDN! This is apparently the official position of MS in this regard! Even a vague doubt creeps in, but haven't this feature been specifically incorporated into MSP in order to be able to sell report builders?

2) Do not fool people with problems and use the TASK_NOTES field.
Wow. Especially if you continue to read. And further it is written that only trimming of the TASK_RTF_NOTES field is stored here.

And that’s all. There are more tips to parse RTF with regular expressions. But regular expressions in MS SQL are also rather absent than they are, to implement them you also need to add a “CLR environment function” ... What should I do?

I had to look for more. Now I already had more opportunities - because obviously you can’t do without addon, I thought and thought, and decided to write addon specifically for parsing RTF. Do you think I took advice number 1? Yes, that's right, I used it. But for some reason SQL chewed on my dll and spat it out with the phrase "I do not believe System.Drawing, it is crooked, poorly written and there are a lot of vulnerabilities in it." Funny, in MS, it turns out, they write dll curves!

Then I finally realized that RTF parsing could not be
done

here, and finally came across this place: NRTFTRee by Oliver Here, of course, not a word about TASK_RTF_NOTES. Yes, and comments in Spanish. However, here I saw a real opportunity to collect the source code of the parser, which was done.

I will omit the details on the part of the assembly without System.Drawing - quite a bit of imagination and understanding that in fact all these classes are not really needed for this task, but there are no irreplaceable ones.

An example of an SQL script that now really works:

SELECT TOP 1000 [TASK_UID]
      ,[TASK_NAME]
      ,[PROJ_UID]
      ,convert(varchar(max),[dbo].ConvertFromRTF(convert(varbinary(max),[TASK_RTF_NOTES])))
	  ,[TASK_NOTES]
  FROM [ProjectServer2010_Published].[dbo].[MSP_TASKS]
where not [TASK_RTF_NOTES] is null


To the question “WHY covert so many times ???” I will answer - for some reason, the CLR functions do not support the transfer of image and varchar (at least in 2005 MS SQL).

Yes, there is still a nuance - the encoding in our MSP for some reason is not 1251, but 1252 (but not always, because part of the projects was transferred), so the conversion function itself is written like this:
[SqlFunction(DataAccess = DataAccessKind.Read)]
        public static SqlBytes ConvertFromRTF(SqlBytes bytes)
        {
            if (bytes == null) return null;
            if (bytes.IsNull) return null;
            if (bytes.Length == 0) return null;
            MemoryStream strm = new MemoryStream(bytes.Value);
            byte[] buff = new byte[4096];
            int rd = strm.Read(buff, 0, buff.Length);
            RtfTree tree = new RtfTree();
            string s1 = Encoding.Default.GetString(buff);
            tree.LoadRtfText(s1);
            string text = tree.Text;
            SqlBytes res;
            if (s1.IndexOf("ansicpg1252") > -1)
            {
                res = new SqlBytes(Encoding.GetEncoding("Windows-1252").GetBytes(tree.Text));
            }
            else if (s1.IndexOf("ansicpg1251") > -1)
            {
                res = new SqlBytes(Encoding.GetEncoding("Windows-1251").GetBytes(tree.Text));
            }
            else
            {
                res = new SqlBytes(Encoding.Default.GetBytes(tree.Text));
            }
            return res;
        }


Links:
1. Original converter: NRTFTRee by Oliver
2. Archive with solution: LibRTFToVarchar

Also popular now: