Sphinx and MSSQL: problems ... (fixed: there are no problems ;-)
Yesterday we tried to get Sphinx working with MSSQL and ran into some confusing behavior.
Software version: Sphinx 0.9.9-rc2
OS: Windows 2003 Server Standard Edition x86 SP2
DB: MSSQL Server 2008 Enterprise x86 SP1
Database collation: Cyrillic_General_CI_AS
With basic settings, there were no problems starting the service itself.
Settings:
- the parent "source" uses "type = odbc", with the connection string given in "odbc_dsn", plus "mssql_unicode = 1".
- the child sources that inherit from it specify their own "sql_query", "sql_query_range" and "sql_range_step"; nothing unusual there.
- the parent index contains the following: "docinfo = extern", "mlock = 0", "morphology = stem_enru", "charset_type = utf-8" and "html_strip = 0".
- the child indexes specify their own "source" and "path".
The indexer process uses the default settings.
The searchd process also uses the default settings, with corrected paths.
The problem is that indexing of certain tables stalls at some row, while the other tables are indexed normally.
By trial and error, excluding fields from the list of indexed fields one at a time, I found the field that cannot be indexed: it has type “varchar(MAX)” and collation “database default”.
I also tried indexing a substring of this field instead of the whole field, and found a length at which indexing completes. But this workaround does not suit us.
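The substring experiment can be automated with a binary search. This is only a sketch: `largest_passing_length` and the `passes` callback are hypothetical names, and it assumes indexing succeeds for every length up to some threshold and fails above it. In real use `passes(n)` would have to rewrite "sql_query" to take the first n characters of the field, run indexer, and report whether it finished; here it is replaced by a made-up stand-in.

```python
def largest_passing_length(passes, upper):
    """Return the largest n in [0, upper] for which passes(n) is True.

    Assumes passes(0) is True and that passes is monotone:
    True up to some threshold, False for everything above it.
    """
    lo, hi = 0, upper
    while lo < hi:
        mid = (lo + hi + 1) // 2  # round up so the loop always makes progress
        if passes(mid):
            lo = mid              # mid works, the threshold is at or above it
        else:
            hi = mid - 1          # mid fails, the threshold is below it
    return lo


# Stand-in for "run indexer on the first n characters"; 4096 is an
# invented threshold used only to exercise the search.
def fake_indexer_run(n):
    return n <= 4096


threshold = largest_passing_length(fake_indexer_run, 100_000)
print(threshold)
```

The search needs about 17 indexer runs to cover lengths up to 100,000, instead of stepping through lengths by hand.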
The field stores text with HTML fragments; other special characters cannot be ruled out.
The indexer process terminates with an error. In the “Application” section of “Event Viewer” we find:
Event Type: Error
Event Source: Application Error
Event Category: (100)
Event ID: 1000
Date: 5/28/2009
Time: 12:50:10 PM
User: N/A
Computer: DEV
Description:
Faulting application indexer.exe, version 0.0.0.0, faulting module indexer.exe, version 0.0.0.0, fault address 0x0001413f.
For more information, see Help and Support Center at go.microsoft.com/fwlink/events.asp
Data:
0000: 41 70 70 6c 69 63 61 74 Applicat
0008: 69 6f 6e 20 46 61 69 6c ion Fail
0010: 75 72 65 20 20 69 6e 64 ure ind
0018: 65 78 65 72 2e 65 78 65 exer.exe
0020: 20 30 2e 30 2e 30 2e 30 0.0.0.0
0028: 20 69 6e 20 69 6e 64 65 in inde
0030: 78 65 72 2e 65 78 65 20 xer.exe
0038: 30 2e 30 2e 30 2e 30 20 0.0.0.0
0040: 61 74 20 6f 66 66 73 65 at offse
0048: 74 20 30 30 30 31 34 31 t 000141
0050: 33 66 3f
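For reference, the Data section of the event can be decoded to check whether it carries anything beyond the Description. The sketch below copies only the hex byte columns from the dump (the right-hand ASCII column is left out, since its last entry "3f" could be mistaken for a byte) and decodes them as ASCII.

```python
# Hex byte columns from the event's Data section, one string per dump row.
hex_rows = [
    "41 70 70 6c 69 63 61 74",
    "69 6f 6e 20 46 61 69 6c",
    "75 72 65 20 20 69 6e 64",
    "65 78 65 72 2e 65 78 65",
    "20 30 2e 30 2e 30 2e 30",
    "20 69 6e 20 69 6e 64 65",
    "78 65 72 2e 65 78 65 20",
    "30 2e 30 2e 30 2e 30 20",
    "61 74 20 6f 66 66 73 65",
    "74 20 30 30 30 31 34 31",
    "33 66",
]

# bytes.fromhex skips the spaces between byte pairs.
message = bytes.fromhex(" ".join(hex_rows)).decode("ascii")
print(message)
```

The decoded text names the same application, module and offset as the Description, so the dump adds no further detail about the failure.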
On a machine with “Visual Studio 2008” installed, the debugger points to this instruction:
0041413F mov byte ptr [eax + ecx], 0
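The address shown by the debugger and the fault address in the event log appear to be the same location. The check below assumes indexer.exe was loaded at 0x00400000, the usual default image base for a 32-bit Windows executable; the post does not state the load address, so treat that constant as an assumption.

```python
DEFAULT_IMAGE_BASE = 0x00400000   # assumption: default base of a 32-bit EXE
fault_offset = 0x0001413F         # "fault address" from the event log
debugger_address = 0x0041413F     # address shown by the debugger

# Offset within the module plus the load address gives the absolute address.
absolute = DEFAULT_IMAGE_BASE + fault_offset
print(f"{absolute:08X}")
print(absolute == debugger_address)
```

Under that assumption both reports refer to the same instruction, a write of a zero byte to the address eax + ecx.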
We assume that the field contains unreadable characters.
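One way to test this assumption is to export the field's values and scan them for control or unassigned characters, recording each value's length at the same time. This is a minimal sketch: `suspicious_chars`, `profile_rows` and the sample rows are hypothetical, and fetching the real rows from the database is left out.

```python
import unicodedata

ALLOWED = {"\t", "\n", "\r"}             # ordinary whitespace controls
SUSPICIOUS_CATEGORIES = {"Cc", "Cn", "Co"}  # control, unassigned, private use


def suspicious_chars(text):
    """Return (position, code point) pairs for characters that look unreadable."""
    return [
        (pos, f"U+{ord(ch):04X}")
        for pos, ch in enumerate(text)
        if ch not in ALLOWED and unicodedata.category(ch) in SUSPICIOUS_CATEGORIES
    ]


def profile_rows(rows):
    """Summarize each (id, text) row: character count, UTF-8 size, odd characters."""
    return [
        {
            "id": row_id,
            "chars": len(text),
            "utf8_bytes": len(text.encode("utf-8")),
            "suspicious": suspicious_chars(text),
        }
        for row_id, text in rows
    ]


# Invented sample rows standing in for values of the problem field.
sample = [
    (1, "<p>Привет</p>"),
    (2, "bad\x00byte\x07"),
]
report = profile_rows(sample)
for entry in report:
    print(entry)
```

If the rows on which indexer stops show no suspicious characters but are unusually long, the field length rather than its content would be the more likely trigger, which would also fit the observation that a substring of the field indexes fine.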
We would appreciate advice on how to diagnose and resolve this problem.
Thanks in advance.