ADDED README.fosspell.md Index: README.fosspell.md ================================================================== --- /dev/null +++ README.fosspell.md @@ -0,0 +1,471 @@ + + + + + +# Fosspell +*Spell check the Fossil SCM source code* + +
+
+
Part 1: Basic usage
+
+
+
Description
+
Prerequisites
+
+
+
BSD/Linux
+
MS Windows (MSYS2)
+
Mac OSX
+
+
+
Setup
+
Usage
+
Result
+
+
+
Part 2: Internals
+
+
+
Cache
+
Personal dictionaries
+
American English versus British English
+
False positives when spell checking source code
+
+
+
False positives, easy to handle
+
False positives, tricky to handle
+
+
+
Why use both `aspell` and `hunspell`?
+
Links
+
+
+ +
+
+## Part 1 : Basic usage
+
+Read this section for basic usage of the `fosspell` script.
+
+### Description
+
+This software checks the Fossil SCM source code for spelling errors, duplicated words, and trailing spaces.
+The result, a list of filenames and words, could be posted to the Fossil SCM mailing list.
+
+### Prerequisites
+
+This software requires `perl` to run the script, and both `aspell` and `hunspell` for spell checking.
+`aspell` and `hunspell` each require their own English dictionary.
+The dictionaries normally have to be installed separately.
+
+#### BSD/Linux
+
+AFAIK, all BSD/Linux distributions come with Perl already installed.
+All common distributions have packages for both `aspell` and `hunspell`.
+The dictionary package names may vary among distributions:
+`en-aspell` or `aspell-en`, `en-hunspell` or `hunspell-en`, etc.
+
+#### MS Windows (using `MSYS2`)
+
+On `MSYS2`, `perl` has to be installed like any other package.
+
+

+    pacman -S perl
+    pacman -S mingw-w64-x86_64-aspell
+    pacman -S mingw-w64-x86_64-aspell-en
+    pacman -S mingw-w64-x86_64-hunspell
+    pacman -S mingw-w64-x86_64-hunspell-en <--- MAY FAIL AS AN UNRECOGNIZED PACKAGE
+
+
+If the installation of the `hunspell-en` dictionary package fails, download and install the dictionaries
+manually:
+
+    pacman -S wget
+    wget http://downloads.sourceforge.net/wordlist/hunspell-en_US-2016.06.26.zip
+    wget http://downloads.sourceforge.net/wordlist/hunspell-en_CA-2016.06.26.zip
+    wget http://downloads.sourceforge.net/wordlist/hunspell-en_GB-ise-2016.06.26.zip
+    pacman -S unzip
+    unzip hunspell-en_US-2016.06.26.zip
+    unzip hunspell-en_CA-2016.06.26.zip
+    unzip hunspell-en_GB-ise-2016.06.26.zip
+    mv en_GB-ise.aff en_GB.aff
+    mv en_GB-ise.dic en_GB.dic
+    mkdir -p /c/msys64/mingw64/share/hunspell
+    mv *.aff *.dic /c/msys64/mingw64/share/hunspell/
+
+#### Mac OSX
+
+Not tested.
+
+### Setup
+
+1. Clone the Fossil repository:
+
+        fossil clone http://www.fossil-scm.org/ fossil.fossil
+
+2. Clone this repository:
+
+        fossil clone http://kuu.se/fossil/fosspell/ fosspell.fossil
+
+3. Open the Fossil repository:
+
+        mkdir fossil
+        cd fossil
+        fossil open /path/to/fossil.fossil
+
+4. Open this repository (inside the Fossil repository):
+
+        mkdir spell
+        cd spell
+        fossil open --nested /path/to/fosspell.fossil
+
+### Usage
+
+Run
+
+    ./fosspell COMMAND
+
+where COMMAND is one of
+
+- `all`
+- `dup`
+- `false`
+- `help`
+- `scan`
+- `setup`
+- `spc`
+- `spell`
+- `version`
+
+Run `./fosspell all` to check the entire Fossil source code tree.
+Run `./fosspell help` to get all the gory details.
+
+### Result
+
+The resulting typos are stored in three text files, one for each type of typo:
+
+ - `typo_spell.txt` : each entry contains: filename, line number, misspelled word
+ - `typo_dup.txt` : each entry contains: filename, paragraph (possibly multiline) with duplicated word
+ - `typo_spc.txt` : each entry contains: filename, line number
+
+## Part 2 : Internals
+
+This section covers the `fosspell` internals.
+If you are only interested in basic usage, you can stop reading here.
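To give a feel for what happens behind the scenes, the trailing-space and duplicated-word checks can be approximated with standard tools. This is only a sketch, not the actual `fosspell` code (which is a Perl script and may work quite differently). The helper names `find_trailing_spaces` and `find_dup_words` are made up for this illustration, and the duplicate check here only looks within a single line, whereas `typo_dup.txt` is described as working on paragraphs.

```shell
#!/bin/sh
# Sketch only: hypothetical helpers, not taken from fosspell itself.

# Print "filename:line number" for every line ending in a space or tab.
find_trailing_spaces() {
  # $1 = file to check
  awk '/[ \t]+$/ { print FILENAME ":" FNR }' "$1"
}

# Print "filename:line number:word" when a purely alphabetic word is
# immediately repeated on the same line (e.g. "the the").
find_dup_words() {
  # $1 = file to check
  awk '{
    for (i = 2; i <= NF; i++)
      if ($i == $(i-1) && $i ~ /^[A-Za-z]+$/)
        print FILENAME ":" FNR ":" $i
  }' "$1"
}
```

Both helpers print nothing for a clean file, so their output could be redirected straight into a report file.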
+
+### Cache
+
+`fosspell` uses the UNIX `file` utility to detect the different types of files, to know if and how to spell check them.
+This is somewhat time-consuming, and is normally done only once.
+The result is stored in cache files.
+Re-running `fosspell` will use the cache instead of running `file`.
+The cache is updated only when `fosspell` detects new files in the Fossil source code tree.
+
+### Personal dictionaries
+
+Two dictionaries are used together with `fosspell`:
+
+- `false.positives.easy.txt`
+  Words which are easy to classify as false positives.
+  The file format is one word per line.
+  This file can be used directly as a personal dictionary for spell checking:
+  `hunspell -p false.positives.easy.txt ...`
+
+- `false.positives.tricky.txt`
+  False positives, where each word has its own section of one or more lines.
+  This is the format:
+
+      [notfound]
+      the --notfound option is used.
+      a "notfound:" tag to tell where to redirect if the particular repository requested
+      notfound: http://url-to-go-to-if-repo-not-found/
+
+
+
+### American English versus British English
+
+The Fossil source code contains spellings both in American English...
+
+`skins/xekri/css.txt:` /* example ticket *colors* */
+
+... and in British English:
+
+`skins/black_and_white/css.txt:` /* consistent *colours* */
+
+Using the `en_US` dictionary, *colours* is detected as a misspelled word:
+
+    echo colours colors | hunspell -d en_US -l
+    colours
+
+Using the `en_GB` dictionary, *colors* is detected as a misspelled word:
+
+    echo colours colors | hunspell -d en_GB -l
+    colors
+
+The trick to accept both spellings is to use both dictionaries:
+
+    echo colours colors | hunspell -d en_US,en_GB -l
+
+
+
+### False positives when spell checking source code
+
+Compared to text written in a natural language, spell checking source code inevitably detects many more false positives.
+Many parts of a source file should obviously be filtered out before spell checking takes place.
+For example, in a `.c` or a `.h` file, it only makes sense to spell check comments and strings.
+Another example is the `.wiki` files, containing HTML tags, where the tags themselves should not be spell checked, only the literal strings.
+Even so, there will be many false positives.
+Source code is by nature full of special technical terms, not always included in a standard English dictionary.
+For example, the word `SQL` is known to `hunspell`'s US English dictionary, but unknown to `aspell`:
+
+    echo SQL | hunspell -l -d en_US
+
+
+    echo SQL | aspell list --lang=en_US
+    SQL
+
+The example above shows a false positive, which can easily be fixed by adding the word to a personal dictionary:
+
+    echo personal_ws-1.1 en 0 > my.false.positives.for.aspell.txt
+    echo SQL >> my.false.positives.for.aspell.txt
+    echo SQL | aspell list --lang=en_US --personal=./my.false.positives.for.aspell.txt
+
+
+We consider these false positives *easy to handle*, as we add them once, and forget about the problem.
+
+Unfortunately there are also false positives that are *tricky to handle*.
+One example is the word `notfound`, which, under normal circumstances, is always a misspelling of `not found`.
+In Fossil terminology, however, `notfound` is an option, used together with the `fossil ui` command, among others.
+This means that `notfound` may or may not be a spelling error, depending on the context.
+We cannot just add `notfound` to our personal dictionary as we did with `SQL`, as that would prevent us from catching
+future misspellings of `not found`.
+Instead, we add all the lines where `notfound` appears to a separate list of *tricky false positives*.
+Text added in the future, containing a line with `notfound`, will be detected as a spelling error,
+unless the entire line matches a line in the existing list.
+The user will then have two options to make the error disappear:
+
+- Either: Fix the typo (if she/he really meant to type `not found`)
+- Or: Add the entire line containing `notfound` to the existing list.
+
+Finally, there are technical terms containing symbols, such as function and variable names, for example `blob_appendf()`.
+In any software, it is common to refer to function and variable names in comments.
+Function and variable names frequently contain underscores (`_`), which become a real headache when working with `hunspell` and `aspell`.
+Both spell checkers consider `_` as a word separator, so in their eyes, `blob_appendf()` is split into `blob` and `appendf`.
+`blob` is considered a valid word, while `appendf` is interpreted as a misspelling of `append`.
+Logical for a spell checker, unfortunate for us. ☹
+This part needs some real hacking... TBD.
+
+#### False positives, easy to handle
+
+Let's take a string from `src/blob.c` as an example:
+
+

+ char *blob_sql_text(Blob *p){
+   blob_is_init(p);
+   if( (p->blobFlags & BLOBFLAG_NotSQL) ){
+     fossil_fatal("Internal error: Use of blob_appendf() to construct SQL text"); /* <--- LET'S SPELL CHECK THIS STRING */
+   }
+   return blob_str(p);
+ }
+
+
+There are no visible spelling errors in the string.
+But when we run `hunspell` on the string, one false positive is detected, `appendf`:
+
+    echo 'Internal error: Use of blob_appendf() to construct SQL text' | hunspell -d en_US,en_GB -l
+
+    appendf
+
+%%% TODO %%% CHECK HOWTO CUSTOM ASPELL DICTIONARY AND ccpp mode %%%
+cat ../bld/blob_.c | aspell list --lang=en --mode=ccpp
+
+The solution is to create a personal dictionary containing false positives:
+
+    echo appendf >> false.positives.easy.txt
+
+Test the dictionary to check that `appendf` disappears:
+
+    echo 'Internal error: Use of blob_appendf() to construct SQL text' | hunspell -d en_US,en_GB -l -p false.positives.easy.txt
+
+
+To apply this technique to all files in the Fossil source code tree, `fosspell` does something similar to one of the `man (1) hunspell` examples:
+
+    EXAMPLES
+    ...
+    hunspell -l *.odt | sort | uniq >unrecognized
+        Saving unrecognized words of ODF documents (filtering duplications).
+
+    hunspell -p unrecognized_but_good *.odt
+        Interactive spell checking of ODF documents, using the previously saved and reduced
+        word list, as a personal dictionary, to speed up spell checking.
+
+In our case, the spell checking must be done in three steps:
+
+1. `hunspell -l | sort | uniq > unrecognized.words.from.text.files.txt`
+2. `echo | hunspell -l | sort | uniq > unrecognized.words.from.c.comments.txt`
+3. `echo | hunspell -l | sort | uniq > unrecognized.words.from.c.strings.txt`
+
+To join the three files and delete duplicates:
+
+    cat unrecognized.words.from.text.files.txt unrecognized.words.from.c.comments.txt unrecognized.words.from.c.strings.txt \
+      | sort | uniq > unrecognized.words.txt
+
+Now begins the tedious task of editing `unrecognized.words.txt`: cut out the true spelling errors,
+and paste them into another file.
+(It is A Good Thing™ to report the contents of this file to the Fossil mailing list.)
+Separating false positives from true ones has to be done manually, or, at least, mostly manually.
+The bright side of this tedious task is that it only has to be done once.
+That means that when you are using this software, the databases for false positives already exist.
+
+There are no silver bullets to help us create the dictionaries, but there are a few methods to reduce the task:
+
+- For example, the `www/` directory contains mainly `.wiki` files, with a format similar to HTML, so if we tell `hunspell`
+to parse these files as HTML, we may reduce the number of false positives:
+
+    hunspell -l -d en_US,en_GB `find www/ -type f -name "*.wiki"` | sort | \
+       uniq > unrecognized.words.from.wiki.files.txt
+    hunspell -H -l -d en_US,en_GB `find www/ -type f -name "*.wiki"` | sort | \
+       uniq > unrecognized.words.from.wiki.files.with.H.flag.txt
+
+    wc -l unrecognized.words.*
+    794 unrecognized.words.from.wiki.files.txt
+    602 unrecognized.words.from.wiki.files.with.H.flag.txt
+
+By using the `-H` flag, we have now reduced the number of words to check from 794 to 602.
+
+- When `hunspell` is run to offer suggestions, each output line starts with one of these signs:
+
+    OK: *
+    Miss: & <original> <count> <offset>: <miss>, <miss>, ...
+    None: # <original> <offset>
+
+Our list of unrecognized words has no `* (OK)` words. The words marked as `& (Miss)` come with suggestions, and may be misspelled words,
+even if most of them actually are false positives.
+The words marked as `# (None)` come with no suggestions, so we can be pretty sure that they are false positives.
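The classification by leading sign can be sketched as a small filter. This is an illustration only: `words_with_marker` is a hypothetical helper name, and it simply assumes that its input lines follow the `&` / `#` / `*` layout listed above, with the checked word in the second field.

```shell
#!/bin/sh
# Sketch: pick out words from spell-checker output by their leading sign.
#   words_with_marker '&'   -> words that came with suggestions
#   words_with_marker '#'   -> words without any suggestion
# (hypothetical helper, not part of fosspell)
words_with_marker() {
  # $1 = marker to select; input is read from stdin
  awk -v m="$1" '$1 == m { print $2 }'
}

# Possible use, following the commands shown in this document:
#   hunspell -d en_US,en_GB < unrecognized.words.txt | words_with_marker '#'
```

Comparing the whole first field, rather than matching a prefix with `grep`, avoids accidentally selecting a line whose first field merely starts with the marker.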
+
+Let's separate the words into two files:
+
+
+    cat unrecognized.words.from.wiki.files.with.H.flag.txt | hunspell -d en_US,en_GB | grep '^#' | \
+       cut -d ' ' -f 2 > unrecognized.words.without.suggestions.from.wiki.files.txt
+
+    cat unrecognized.words.from.wiki.files.with.H.flag.txt | hunspell -d en_US,en_GB | grep '^&' | \
+       cut -d ' ' -f 2 > unrecognized.words.with.suggestions.from.wiki.files.txt
+
+    wc -l unrecognized.words.with*suggestions.from.wiki.files.txt
+    565 unrecognized.words.with.suggestions.from.wiki.files.txt
+    37 unrecognized.words.without.suggestions.from.wiki.files.txt
+    602 total
+
+We can assume that all the words without suggestions are false positives.
+In fact, almost all 37 words in `unrecognized.words.without.suggestions.from.wiki.files.txt` are hash strings.
+This file will be used as a "base" for false positives, so we can just copy the file:
+
+    cp unrecognized.words.without.suggestions.from.wiki.files.txt false.positives.easy.txt
+
+Now
+
+Even so, the vast majority of these words will be false positives.
+Save the edited file as `false.positives.easy.txt`.
+
+%%% COPY WORDS EITHER TO easy OR TO tricky OR TO true.positives. %%%
+
+#### False positives, tricky to handle
+
+The section above shows an obvious case of a false positive:
+
+- `appendf`: (part of) a function name
+
+However, there are less obvious and more ambiguous cases.
+One example is the string `notfound`; the commonly used command `fossil ui` has an option called `--notfound` (see `src/main.c`, for example).
+Thus, there are several C comments and strings containing the word `notfound`.
+The common expression `not found` is also present in the Fossil source code (in `www/tech_overview.wiki`, for example).
+How can `hunspell` know if `notfound` is a typo (we really meant to type `not found`) or not (we refer to the `--notfound` option)?
+Obviously, it can't.
+The problem is that simply adding `notfound` to the dictionary of false positives would not solve the problem,
+as misspelling `not found` as `notfound` would never be detected.
+(Note that misspelling the other way around, `notfound` as `not found`, is much less of a problem,
+as the compiler or test programs would detect this misspelling as something similar to "unknown option `--not found`".)
+
+One way to deal with such words is to add the entire line where the word occurs to a special database for "known tricky words".
+The database contains the
+Any occurrences of `notfound`
+
+
+    for l in "`grep -n notfound ../src/main.c`"; do printf "%s\n" "$l"; done
+
+
+
+
+
+%%%
+--personal=./false.positives.easy.txt
+
+    notfound: src/main.c: (2361,7) (2540,51) ...
+
+The column position is needed to deal with lines like this (src/main.c, line 2361):
+
+    ** --notfound URL use URL as "HTTP 404, object not found" page.
+
+
+
+%%%
+The easiest way to do this is probably to:
+
+1. run `./fosspell spell all` - creates `typo_spell.txt`
+2. `typo_spell.txt` is quite big, so it is rather tedious to check all false positives manually.
+   Anyway, this is a one-time job.
+   As a helping tool,
+
+Even when filtering out very long words and/or strings that obviously are not words (such as hashes),
+there are still many more false positives than real misspelled words.
+The solution is to create a database (a plain text file) of false positives.
+The easiest way to create the database is to delete the (relatively few) real misspelled words from a result,
+and keep the false positives as a database.
+Subsequent runs of `fosspell`, using the same version of the Fossil source code tree, will then have no false positives.
+Running `fosspell` using an updated version of the Fossil source code tree will probably cause a few false positives (new variable names, etc.),
+but it should be a minor task to add them to the database.
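The idea of deleting the real typos and keeping the rest as a database can be sketched with `sort` and `comm`. This is a minimal illustration, not what `fosspell` actually does: the helper names `make_false_positive_db` and `new_words` and all file names are assumptions, and the inputs are taken to be plain word lists with one word per line.

```shell
#!/bin/sh
# Sketch: hypothetical helpers for a plain-text false positive database.

# Write every reported word that is NOT a confirmed typo to the database.
make_false_positive_db() {
  # $1 = all reported words, $2 = confirmed real typos, $3 = database to write
  LC_ALL=C sort -u "$1" > "$1.sorted"
  LC_ALL=C sort -u "$2" > "$2.sorted"
  # comm -23 keeps the lines that appear only in the first (sorted) file.
  LC_ALL=C comm -23 "$1.sorted" "$2.sorted" > "$3"
  rm -f "$1.sorted" "$2.sorted"
}

# Print words from a later run that are not yet in the database.
new_words() {
  # $1 = words reported by the new run, $2 = database written by the helper above
  LC_ALL=C sort -u "$1" | LC_ALL=C comm -23 - "$2"
}
```

`LC_ALL=C` is set on both `sort` and `comm` so that the two tools agree on the ordering, which `comm` relies on.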
+Use the following command to add false positives to the database:
+
+    ./fosspell addfalse ?FILENAME? | ?TEXT?
+
+Words can be added from a file, or as words directly from the command line.
+The new words are added, one per line, to the database, which is then re-sorted alphabetically.
+A warning message is shown when trying to add already existing words to the database.
+TBD.
+
+
+
+### Why use both `aspell` and `hunspell`?
+
+%%% BASICALLY: ASPELL FOR SOME STUFF, HUNSPELL FOR OTHER.
+NOT VERY WELL DOCUMENTED example 1: DOCUMENTATION FOR ccpp ASPELL FILTER MODE example 2: special characters possible in hunspell?
+perl MODE NOT DOCUMENTED
+
+Well, `aspell` may be faster, more stable, or preferable to `hunspell` for other reasons, **but**:
+
+1. Spell checking source code means including many non-alphabetic characters.
+`hunspell` deals better with such characters than `aspell`.
+
+        cat ../src/main.c | aspell -a --lang en| grep -i Error # NOT OK
+        cat ../src/main.c | hunspell [-a] -d en_US| grep -i Error # SLOWER, BUT OK (-a FLAG NOT NEEDED)
+
+2. When detecting [tricky false positives](#fptricky),
+it is useful to be able to print the entire line where the spell checked word is found.
+This can be done in `hunspell` using the `-L` option:
+
+        cat ../src/main.c | hunspell -d en_US -L | grep notfound
+
+AFAIK, there is no such option in `aspell`.
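Once whole lines are available, the comparison against the list of accepted tricky lines can be sketched with `grep` alone. This is an assumption about how such a check could look, not the `fosspell` implementation; `unknown_tricky_lines` and the file arguments are hypothetical.

```shell
#!/bin/sh
# Sketch: report lines that mention a tricky word but are not already
# listed, verbatim, in the file of accepted lines.
# (hypothetical helper, not part of fosspell)
unknown_tricky_lines() {
  # $1 = tricky word, $2 = file with accepted lines, $3 = file to check
  # First grep: fixed-string search for the word.
  # Second grep: drop lines that match an accepted line exactly (-x),
  #              treating the accepted lines as fixed strings (-F).
  grep -F -e "$1" "$3" | grep -F -x -v -f "$2"
}
```

A line is only accepted when it matches in full, so a new sentence that misspells `not found` as `notfound` is still reported even though other lines containing `notfound` are on the accepted list.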
+ + +### Links top + +This page: + +Hunspell: + +Text::Hunspell Perl module: + ADDED aspell/Makefile Index: aspell/Makefile ================================================================== --- /dev/null +++ aspell/Makefile @@ -0,0 +1,32 @@ +CFLAGS+=-g -Wall -Wextra -Werror -ansi -pedantic +CFLAGS+=-I/usr/local/include +LDFLAGS=-L/usr/local/lib +LDLIBS=-laspell + +ASPELL = aspell +ASPELL_FLAGS = --dont-validate-words + +DICTLANG=en +#DICTNAME = fossil +DICTNAME = en_FOSSIL +DEPFILES = ${DICTNAME}.wl ${DICTNAME}_affix.dat ${DICTNAME}_phonet.dat +WORD_LIST = ${DICTNAME}.wl + + +all: dict main example-c list-dicts + +main: Makefile main.c + +dict: ${DICTNAME}.rws + +${DICTNAME}.rws: ${DEPFILES} + cat ${WORD_LIST} | ${ASPELL} ${ASPELL_FLAGS} --lang=${DICTLANG} create master ./$@ + +testOLD: + @echo cheira | aspell list --master=./fossil.rws --data-dir=. --dict-dir=. --lang=fossil + +test: + @echo blob_appendf | aspell list --master=./fossil.rws --dict-dir=. --lang=fossil + +clean: + rm -f ${DICTNAME}.rws ADDED aspell/blob_.c Index: aspell/blob_.c ================================================================== --- /dev/null +++ aspell/blob_.c @@ -0,0 +1,1267 @@ +#line 1 "./src/blob.c" +/* +** Copyright (c) 2006 D. Richard Hipp +** +** This program is free software; you can redistribute it and/or +** modify it under the terms of the Simplified BSD License (also +** known as the "2-Clause License" or "FreeBSD License".) + +** This program is distributed in the hope that it will be useful, +** but without any warranty; without even the implied warranty of +** merchantability or fitness for a particular purpose. +** +** Author contact information: +** drh@hwaci.com +** http://www.hwaci.com/drh/ +** +******************************************************************************* +** +** A Blob is a variable-length containers for arbitrary string +** or binary data. 
+*/ +#include "config.h" +#if defined(FOSSIL_ENABLE_MINIZ) +# define MINIZ_HEADER_FILE_ONLY +# include "miniz.c" +#else +# include +#endif +#include "blob.h" +#if defined(_WIN32) +#include +#include +#endif + +#if INTERFACE +/* +** A Blob can hold a string or a binary object of arbitrary size. The +** size changes as necessary. +*/ +struct Blob { + unsigned int nUsed; /* Number of bytes used in aData[] */ + unsigned int nAlloc; /* Number of bytes allocated for aData[] */ + unsigned int iCursor; /* Next character of input to parse */ + unsigned int blobFlags; /* One or more BLOBFLAG_* bits */ + char *aData; /* Where the information is stored */ + void (*xRealloc)(Blob*, unsigned int); /* Function to reallocate the buffer */ +}; + +/* +** Allowed values for Blob.blobFlags +*/ +#define BLOBFLAG_NotSQL 0x0001 /* Non-SQL text */ + +/* +** The current size of a Blob +*/ +#define blob_size(X) ((X)->nUsed) + +/* +** The buffer holding the blob data +*/ +#define blob_buffer(X) ((X)->aData) + +/* +** Seek whence parameter values +*/ +#define BLOB_SEEK_SET 1 +#define BLOB_SEEK_CUR 2 +#define BLOB_SEEK_END 3 + +#endif /* INTERFACE */ + +/* +** Make sure a blob is initialized +*/ +#define blob_is_init(x) \ + assert((x)->xRealloc==blobReallocMalloc || (x)->xRealloc==blobReallocStatic) + +/* +** Make sure a blob does not contain malloced memory. +** +** This might fail if we are unlucky and x is uninitialized. For that +** reason it should only be used locally for debugging. Leave it turned +** off for production. +*/ +#if 0 /* Enable for debugging only */ +#define assert_blob_is_reset(x) assert(blob_is_reset(x)) +#else +#define assert_blob_is_reset(x) +#endif + + + +/* +** We find that the built-in isspace() function does not work for +** some international character sets. So here is a substitute. +*/ +int fossil_isspace(char c){ + return c==' ' || (c<='\r' && c>='\t'); +} + +/* +** Other replacements for ctype.h functions. 
+*/ +int fossil_islower(char c){ return c>='a' && c<='z'; } +int fossil_isupper(char c){ return c>='A' && c<='Z'; } +int fossil_isdigit(char c){ return c>='0' && c<='9'; } +int fossil_tolower(char c){ + return fossil_isupper(c) ? c - 'A' + 'a' : c; +} +int fossil_toupper(char c){ + return fossil_islower(c) ? c - 'a' + 'A' : c; +} +int fossil_isalpha(char c){ + return (c>='a' && c<='z') || (c>='A' && c<='Z'); +} +int fossil_isalnum(char c){ + return (c>='a' && c<='z') || (c>='A' && c<='Z') || (c>='0' && c<='9'); +} + + +/* +** COMMAND: test-isspace +** +** Verify that the fossil_isspace() routine is working correctly by +** testing it on all possible inputs. +*/ +void isspace_cmd(void){ + int i; + for(i=0; i<=255; i++){ + if( i==' ' || i=='\n' || i=='\t' || i=='\v' + || i=='\f' || i=='\r' ){ + assert( fossil_isspace((char)i) ); + }else{ + assert( !fossil_isspace((char)i) ); + } + } + fossil_print("All 256 characters OK\n"); +} + +/* +** This routine is called if a blob operation fails because we +** have run out of memory. +*/ +static void blob_panic(void){ + static const char zErrMsg[] = "out of memory\n"; + fputs(zErrMsg, stderr); + fossil_exit(1); +} + +/* +** A reallocation function that assumes that aData came from malloc(). +** This function attempts to resize the buffer of the blob to hold +** newSize bytes. +** +** No attempt is made to recover from an out-of-memory error. +** If an OOM error occurs, an error message is printed on stderr +** and the program exits. 
+*/ +void blobReallocMalloc(Blob *pBlob, unsigned int newSize){ + if( newSize==0 ){ + free(pBlob->aData); + pBlob->aData = 0; + pBlob->nAlloc = 0; + pBlob->nUsed = 0; + pBlob->iCursor = 0; + pBlob->blobFlags = 0; + }else if( newSize>pBlob->nAlloc || newSizenAlloc-4000 ){ + char *pNew = fossil_realloc(pBlob->aData, newSize); + pBlob->aData = pNew; + pBlob->nAlloc = newSize; + if( pBlob->nUsed>pBlob->nAlloc ){ + pBlob->nUsed = pBlob->nAlloc; + } + } +} + +/* +** An initializer for Blobs +*/ +#if INTERFACE +#define BLOB_INITIALIZER {0,0,0,0,0,blobReallocMalloc} +#endif +const Blob empty_blob = BLOB_INITIALIZER; + +/* +** A reallocation function for when the initial string is in unmanaged +** space. Copy the string to memory obtained from malloc(). +*/ +static void blobReallocStatic(Blob *pBlob, unsigned int newSize){ + if( newSize==0 ){ + *pBlob = empty_blob; + }else{ + char *pNew = fossil_malloc( newSize ); + if( pBlob->nUsed>newSize ) pBlob->nUsed = newSize; + memcpy(pNew, pBlob->aData, pBlob->nUsed); + pBlob->aData = pNew; + pBlob->xRealloc = blobReallocMalloc; + pBlob->nAlloc = newSize; + } +} + +/* +** Reset a blob to be an empty container. +*/ +void blob_reset(Blob *pBlob){ + blob_is_init(pBlob); + pBlob->xRealloc(pBlob, 0); +} + + +/* +** Return true if the blob has been zeroed - in other words if it contains +** no malloced memory. This only works reliably if the blob has been +** initialized - it can return a false negative on an uninitialized blob. +*/ +int blob_is_reset(Blob *pBlob){ + if( pBlob==0 ) return 1; + if( pBlob->nUsed ) return 0; + if( pBlob->xRealloc==blobReallocMalloc && pBlob->nAlloc ) return 0; + return 1; +} + +/* +** Initialize a blob to a string or byte-array constant of a specified length. +** Any prior data in the blob is discarded. 
+*/ +void blob_init(Blob *pBlob, const char *zData, int size){ + assert_blob_is_reset(pBlob); + if( zData==0 ){ + *pBlob = empty_blob; + }else{ + if( size<=0 ) size = strlen(zData); + pBlob->nUsed = pBlob->nAlloc = size; + pBlob->aData = (char*)zData; + pBlob->iCursor = 0; + pBlob->blobFlags = 0; + pBlob->xRealloc = blobReallocStatic; + } +} + +/* +** Initialize a blob to a nul-terminated string. +** Any prior data in the blob is discarded. +*/ +void blob_set(Blob *pBlob, const char *zStr){ + blob_init(pBlob, zStr, -1); +} + +/* +** Initialize a blob to a nul-terminated string obtained from fossil_malloc(). +** The blob will take responsibility for freeing the string. +*/ +void blob_set_dynamic(Blob *pBlob, char *zStr){ + blob_init(pBlob, zStr, -1); + pBlob->xRealloc = blobReallocMalloc; +} + +/* +** Initialize a blob to an empty string. +*/ +void blob_zero(Blob *pBlob){ + static const char zEmpty[] = ""; + assert_blob_is_reset(pBlob); + pBlob->nUsed = 0; + pBlob->nAlloc = 1; + pBlob->aData = (char*)zEmpty; + pBlob->iCursor = 0; + pBlob->blobFlags = 0; + pBlob->xRealloc = blobReallocStatic; +} + +/* +** Append text or data to the end of a blob. +*/ +void blob_append(Blob *pBlob, const char *aData, int nData){ + assert( aData!=0 || nData==0 ); + blob_is_init(pBlob); + if( nData<0 ) nData = strlen(aData); + if( nData==0 ) return; + if( pBlob->nUsed + nData >= pBlob->nAlloc ){ + pBlob->xRealloc(pBlob, pBlob->nUsed + nData + pBlob->nAlloc + 100); + if( pBlob->nUsed + nData >= pBlob->nAlloc ){ + blob_panic(); + } + } + memcpy(&pBlob->aData[pBlob->nUsed], aData, nData); + pBlob->nUsed += nData; + pBlob->aData[pBlob->nUsed] = 0; /* Blobs are always nul-terminated */ +} + +/* +** Copy a blob +*/ +void blob_copy(Blob *pTo, Blob *pFrom){ + blob_is_init(pFrom); + blob_zero(pTo); + blob_append(pTo, blob_buffer(pFrom), blob_size(pFrom)); +} + +/* +** Return a pointer to a null-terminated string for a blob. 
+*/ +char *blob_str(Blob *p){ + blob_is_init(p); + if( p->nUsed==0 ){ + blob_append(p, "", 1); /* NOTE: Changes nUsed. */ + p->nUsed = 0; + } + if( p->aData[p->nUsed]!=0 ){ + blob_materialize(p); + } + return p->aData; +} + +/* +** Return a pointer to a null-terminated string for a blob that has +** been created using blob_append_sql() and not blob_appendf(). If +** text was ever added using blob_appendf() then throw an error. +*/ +char *blob_sql_text(Blob *p){ + blob_is_init(p); + if( (p->blobFlags & BLOBFLAG_NotSQL) ){ + fossil_fatal("Internal error: Use of blob_appendf() to construct SQL text"); + } + return blob_str(p); +} + + +/* +** Return a pointer to a null-terminated string for a blob. +** +** WARNING: If the blob is ephemeral, it might cause a '\000' +** character to be inserted into the middle of the parent blob. +** Example: Suppose p is a token extracted from some larger +** blob pBig using blob_token(). If you call this routine on p, +** then a '\000' character will be inserted in the middle of +** pBig in order to cause p to be nul-terminated. If pBig +** should not be modified, then use blob_str() instead of this +** routine. blob_str() will make a copy of the p if necessary +** to avoid modifying pBig. +*/ +char *blob_terminate(Blob *p){ + blob_is_init(p); + if( p->nUsed==0 ) return ""; + p->aData[p->nUsed] = 0; + return p->aData; +} + +/* +** Compare two blobs. Return negative, zero, or positive if the first +** blob is less then, equal to, or greater than the second. +*/ +int blob_compare(Blob *pA, Blob *pB){ + int szA, szB, sz, rc; + blob_is_init(pA); + blob_is_init(pB); + szA = blob_size(pA); + szB = blob_size(pB); + sz = szAnUsed==sizeof(S)-1 && memcmp((B)->aData,S,sizeof(S)-1)==0) +#endif + + +/* +** Attempt to resize a blob so that its internal buffer is +** nByte in size. The blob is truncated if necessary. 
+*/ +void blob_resize(Blob *pBlob, unsigned int newSize){ + pBlob->xRealloc(pBlob, newSize+1); + pBlob->nUsed = newSize; + pBlob->aData[newSize] = 0; +} + +/* +** Make sure a blob is nul-terminated and is not a pointer to unmanaged +** space. Return a pointer to the data. +*/ +char *blob_materialize(Blob *pBlob){ + blob_resize(pBlob, pBlob->nUsed); + return pBlob->aData; +} + + +/* +** Call dehttpize on a blob. This causes an ephemeral blob to be +** materialized. +*/ +void blob_dehttpize(Blob *pBlob){ + blob_materialize(pBlob); + pBlob->nUsed = dehttpize(pBlob->aData); +} + +/* +** Extract N bytes from blob pFrom and use it to initialize blob pTo. +** Return the actual number of bytes extracted. +** +** After this call completes, pTo will be an ephemeral blob. +*/ +int blob_extract(Blob *pFrom, int N, Blob *pTo){ + blob_is_init(pFrom); + assert_blob_is_reset(pTo); + if( pFrom->iCursor + N > pFrom->nUsed ){ + N = pFrom->nUsed - pFrom->iCursor; + if( N<=0 ){ + blob_zero(pTo); + return 0; + } + } + pTo->nUsed = N; + pTo->nAlloc = N; + pTo->aData = &pFrom->aData[pFrom->iCursor]; + pTo->iCursor = 0; + pTo->xRealloc = blobReallocStatic; + pFrom->iCursor += N; + return N; +} + +/* +** Rewind the cursor on a blob back to the beginning. +*/ +void blob_rewind(Blob *p){ + p->iCursor = 0; +} + +/* +** Seek the cursor in a blob to the indicated offset. +*/ +int blob_seek(Blob *p, int offset, int whence){ + if( whence==BLOB_SEEK_SET ){ + p->iCursor = offset; + }else if( whence==BLOB_SEEK_CUR ){ + p->iCursor += offset; + }else if( whence==BLOB_SEEK_END ){ + p->iCursor = p->nUsed + offset - 1; + } + if( p->iCursor>p->nUsed ){ + p->iCursor = p->nUsed; + } + return p->iCursor; +} + +/* +** Return the current offset into the blob +*/ +int blob_tell(Blob *p){ + return p->iCursor; +} + +/* +** Extract a single line of text from pFrom beginning at the current +** cursor location and use that line of text to initialize pTo. +** pTo will include the terminating \n. 
Return the number of bytes +** in the line including the \n at the end. 0 is returned at +** end-of-file. +** +** The cursor of pFrom is left pointing at the first byte past the +** \n that terminated the line. +** +** pTo will be an ephermeral blob. If pFrom changes, it might alter +** pTo as well. +*/ +int blob_line(Blob *pFrom, Blob *pTo){ + char *aData = pFrom->aData; + int n = pFrom->nUsed; + int i = pFrom->iCursor; + + while( iiCursor, pTo); + return pTo->nUsed; +} + +/* +** Trim whitespace off of the end of a blob. Return the number +** of characters remaining. +** +** All this does is reduce the length counter. This routine does +** not insert a new zero terminator. +*/ +int blob_trim(Blob *p){ + char *z = p->aData; + int n = p->nUsed; + while( n>0 && fossil_isspace(z[n-1]) ){ n--; } + p->nUsed = n; + return n; +} + +/* +** Extract a single token from pFrom and use it to initialize pTo. +** Return the number of bytes in the token. If no token is found, +** return 0. +** +** A token consists of one or more non-space characters. Leading +** whitespace is ignored. +** +** The cursor of pFrom is left pointing at the first character past +** the end of the token. +** +** pTo will be an ephermeral blob. If pFrom changes, it might alter +** pTo as well. +*/ +int blob_token(Blob *pFrom, Blob *pTo){ + char *aData = pFrom->aData; + int n = pFrom->nUsed; + int i = pFrom->iCursor; + while( iiCursor = i; + while( iiCursor, pTo); + while( iiCursor = i; + return pTo->nUsed; +} + +/* +** Extract a single SQL token from pFrom and use it to initialize pTo. +** Return the number of bytes in the token. If no token is found, +** return 0. +** +** An SQL token consists of one or more non-space characters. If the +** first character is ' then the token is terminated by a matching ' +** (ignoring double '') or by the end of the string +** +** The cursor of pFrom is left pointing at the first character past +** the end of the token. +** +** pTo will be an ephermeral blob. 
If pFrom changes, it might alter +** pTo as well. +*/ +int blob_sqltoken(Blob *pFrom, Blob *pTo){ + char *aData = pFrom->aData; + int n = pFrom->nUsed; + int i = pFrom->iCursor; + while( iiCursor = i; + if( aData[i]=='\'' ){ + i++; + while( iiCursor, pTo); + while( iiCursor = i; + return pTo->nUsed; +} + +/* +** Extract everything from the current cursor to the end of the blob +** into a new blob. The new blob is an ephemerial reference to the +** original blob. The cursor of the original blob is unchanged. +*/ +int blob_tail(Blob *pFrom, Blob *pTo){ + int iCursor = pFrom->iCursor; + blob_extract(pFrom, pFrom->nUsed-pFrom->iCursor, pTo); + pFrom->iCursor = iCursor; + return pTo->nUsed; +} + +/* +** Copy N lines of text from pFrom into pTo. The copy begins at the +** current cursor position of pIn. The pIn cursor is left pointing +** at the first character past the last \n copied. +** +** If pTo==NULL then this routine simply skips over N lines. +*/ +void blob_copy_lines(Blob *pTo, Blob *pFrom, int N){ + char *z = pFrom->aData; + int i = pFrom->iCursor; + int n = pFrom->nUsed; + int cnt = 0; + + if( N==0 ) return; + while( iaData[pFrom->iCursor], i - pFrom->iCursor); + } + pFrom->iCursor = i; +} + +/* +** Return true if the blob contains a valid UUID_SIZE-digit base16 identifier. +*/ +int blob_is_uuid(Blob *pBlob){ + return blob_size(pBlob)==UUID_SIZE + && validate16(blob_buffer(pBlob), UUID_SIZE); +} + +/* +** Return true if the blob contains a valid filename +*/ +int blob_is_filename(Blob *pBlob){ + return file_is_simple_pathname(blob_str(pBlob), 1); +} + +/* +** Return true if the blob contains a valid 32-bit integer. Store +** the integer value in *pValue. 
+*/ +int blob_is_int(Blob *pBlob, int *pValue){ + const char *z = blob_buffer(pBlob); + int i, n, c, v; + n = blob_size(pBlob); + v = 0; + for(i=0; i='0' && c<='9'; i++){ + v = v*10 + c - '0'; + } + if( i==n ){ + *pValue = v; + return 1; + }else{ + return 0; + } +} + +/* +** Return true if the blob contains a valid 64-bit integer. Store +** the integer value in *pValue. +*/ +int blob_is_int64(Blob *pBlob, sqlite3_int64 *pValue){ + const char *z = blob_buffer(pBlob); + int i, n, c; + sqlite3_int64 v; + n = blob_size(pBlob); + v = 0; + for(i=0; i='0' && c<='9'; i++){ + v = v*10 + c - '0'; + } + if( i==n ){ + *pValue = v; + return 1; + }else{ + return 0; + } +} + +/* +** Zero or reset an array of Blobs. +*/ +void blobarray_zero(Blob *aBlob, int n){ + int i; + for(i=0; iblobFlags |= BLOBFLAG_NotSQL; + } +} +void blob_append_sql(Blob *pBlob, const char *zFormat, ...){ + if( pBlob ){ + va_list ap; + va_start(ap, zFormat); + vxprintf(pBlob, zFormat, ap); + va_end(ap); + } +} +void blob_vappendf(Blob *pBlob, const char *zFormat, va_list ap){ + if( pBlob ) vxprintf(pBlob, zFormat, ap); +} + +/* +** Initialize a blob to the data on an input channel. Return +** the number of bytes read into the blob. Any prior content +** of the blob is discarded, not freed. +*/ +int blob_read_from_channel(Blob *pBlob, FILE *in, int nToRead){ + size_t n; + blob_zero(pBlob); + if( nToRead<0 ){ + char zBuf[10000]; + while( !feof(in) ){ + n = fread(zBuf, 1, sizeof(zBuf), in); + if( n>0 ){ + blob_append(pBlob, zBuf, n); + } + } + }else{ + blob_resize(pBlob, nToRead); + n = fread(blob_buffer(pBlob), 1, nToRead, in); + blob_resize(pBlob, n); + } + return blob_size(pBlob); +} + +/* +** Initialize a blob to be the content of a file. If the filename +** is blank or "-" then read from standard input. +** +** Any prior content of the blob is discarded, not freed. +** +** Return the number of bytes read. Calls fossil_fatal() on error (i.e. +** it exit()s and does not return). 
+*/ +int blob_read_from_file(Blob *pBlob, const char *zFilename){ + int size, got; + FILE *in; + if( zFilename==0 || zFilename[0]==0 + || (zFilename[0]=='-' && zFilename[1]==0) ){ + return blob_read_from_channel(pBlob, stdin, -1); + } + size = file_wd_size(zFilename); + blob_zero(pBlob); + if( size<0 ){ + fossil_fatal("no such file: %s", zFilename); + } + if( size==0 ){ + return 0; + } + blob_resize(pBlob, size); + in = fossil_fopen(zFilename, "rb"); + if( in==0 ){ + fossil_fatal("cannot open %s for reading", zFilename); + } + got = fread(blob_buffer(pBlob), 1, size, in); + fclose(in); + if( got= 0 ){ + return nWrote; + } + fflush(stdout); + _setmode(_fileno(stdout), _O_BINARY); +#endif + fwrite(blob_buffer(pBlob), 1, nWrote, stdout); +#if defined(_WIN32) + fflush(stdout); + _setmode(_fileno(stdout), _O_TEXT); +#endif + }else{ + file_mkfolder(zFilename, 1, 0); + out = fossil_fopen(zFilename, "wb"); + if( out==0 ){ + fossil_fatal_recursive("unable to open file \"%s\" for writing", + zFilename); + return 0; + } + blob_is_init(pBlob); + nWrote = fwrite(blob_buffer(pBlob), 1, blob_size(pBlob), out); + fclose(out); + if( nWrote!=blob_size(pBlob) ){ + fossil_fatal_recursive("short write: %d of %d bytes to %s", nWrote, + blob_size(pBlob), zFilename); + } + } + return nWrote; +} + +/* +** Compress a blob pIn. Store the result in pOut. It is ok for pIn and +** pOut to be the same blob. +** +** pOut must either be the same as pIn or else uninitialized. 
+*/ +void blob_compress(Blob *pIn, Blob *pOut){ + unsigned int nIn = blob_size(pIn); + unsigned int nOut = 13 + nIn + (nIn+999)/1000; + unsigned long int nOut2; + unsigned char *outBuf; + Blob temp; + blob_zero(&temp); + blob_resize(&temp, nOut+4); + outBuf = (unsigned char*)blob_buffer(&temp); + outBuf[0] = nIn>>24 & 0xff; + outBuf[1] = nIn>>16 & 0xff; + outBuf[2] = nIn>>8 & 0xff; + outBuf[3] = nIn & 0xff; + nOut2 = (long int)nOut; + compress(&outBuf[4], &nOut2, + (unsigned char*)blob_buffer(pIn), blob_size(pIn)); + if( pOut==pIn ) blob_reset(pOut); + assert_blob_is_reset(pOut); + *pOut = temp; + blob_resize(pOut, nOut2+4); +} + +/* +** COMMAND: test-compress +** +** Usage: %fossil test-compress INPUTFILE OUTPUTFILE +** +** Run compression on INPUTFILE and write the result into OUTPUTFILE. +** +** This is used to test and debug the blob_compress() routine. +*/ +void compress_cmd(void){ + Blob f; + if( g.argc!=4 ) usage("INPUTFILE OUTPUTFILE"); + blob_read_from_file(&f, g.argv[2]); + blob_compress(&f, &f); + blob_write_to_file(&f, g.argv[3]); +} + +/* +** Compress the concatenation of a blobs pIn1 and pIn2. Store the result +** in pOut. +** +** pOut must be either uninitialized or must be the same as either pIn1 or +** pIn2. 
+*/ +void blob_compress2(Blob *pIn1, Blob *pIn2, Blob *pOut){ + unsigned int nIn = blob_size(pIn1) + blob_size(pIn2); + unsigned int nOut = 13 + nIn + (nIn+999)/1000; + unsigned char *outBuf; + z_stream stream; + Blob temp; + blob_zero(&temp); + blob_resize(&temp, nOut+4); + outBuf = (unsigned char*)blob_buffer(&temp); + outBuf[0] = nIn>>24 & 0xff; + outBuf[1] = nIn>>16 & 0xff; + outBuf[2] = nIn>>8 & 0xff; + outBuf[3] = nIn & 0xff; + stream.zalloc = (alloc_func)0; + stream.zfree = (free_func)0; + stream.opaque = 0; + stream.avail_out = nOut; + stream.next_out = &outBuf[4]; + deflateInit(&stream, 9); + stream.avail_in = blob_size(pIn1); + stream.next_in = (unsigned char*)blob_buffer(pIn1); + deflate(&stream, 0); + stream.avail_in = blob_size(pIn2); + stream.next_in = (unsigned char*)blob_buffer(pIn2); + deflate(&stream, 0); + deflate(&stream, Z_FINISH); + blob_resize(&temp, stream.total_out + 4); + deflateEnd(&stream); + if( pOut==pIn1 ) blob_reset(pOut); + if( pOut==pIn2 ) blob_reset(pOut); + assert_blob_is_reset(pOut); + *pOut = temp; +} + +/* +** COMMAND: test-compress-2 +** +** Usage: %fossil test-compress-2 IN1 IN2 OUT +** +** Read files IN1 and IN2, concatenate the content, compress the +** content, then write results into OUT. +** +** This is used to test and debug the blob_compress2() routine. +*/ +void compress2_cmd(void){ + Blob f1, f2; + if( g.argc!=5 ) usage("INPUTFILE1 INPUTFILE2 OUTPUTFILE"); + blob_read_from_file(&f1, g.argv[2]); + blob_read_from_file(&f2, g.argv[3]); + blob_compress2(&f1, &f2, &f1); + blob_write_to_file(&f1, g.argv[4]); +} + +/* +** Uncompress blob pIn and store the result in pOut. It is ok for pIn and +** pOut to be the same blob. +** +** pOut must be either uninitialized or the same as pIn. 
+*/ +int blob_uncompress(Blob *pIn, Blob *pOut){ + unsigned int nOut; + unsigned char *inBuf; + unsigned int nIn = blob_size(pIn); + Blob temp; + int rc; + unsigned long int nOut2; + if( nIn<=4 ){ + return 0; + } + inBuf = (unsigned char*)blob_buffer(pIn); + nOut = (inBuf[0]<<24) + (inBuf[1]<<16) + (inBuf[2]<<8) + inBuf[3]; + blob_zero(&temp); + blob_resize(&temp, nOut+1); + nOut2 = (long int)nOut; + rc = uncompress((unsigned char*)blob_buffer(&temp), &nOut2, + &inBuf[4], nIn - 4); + if( rc!=Z_OK ){ + blob_reset(&temp); + return 1; + } + blob_resize(&temp, nOut2); + if( pOut==pIn ) blob_reset(pOut); + assert_blob_is_reset(pOut); + *pOut = temp; + return 0; +} + +/* +** COMMAND: test-uncompress +** +** Usage: %fossil test-uncompress IN OUT +** +** Read the content of file IN, uncompress that content, and write the +** result into OUT. This command is intended for testing of the the +** blob_compress() function. +*/ +void uncompress_cmd(void){ + Blob f; + if( g.argc!=4 ) usage("INPUTFILE OUTPUTFILE"); + blob_read_from_file(&f, g.argv[2]); + blob_uncompress(&f, &f); + blob_write_to_file(&f, g.argv[3]); +} + +/* +** COMMAND: test-cycle-compress +** +** Compress and uncompress each file named on the command line. +** Verify that the original content is recovered. +*/ +void test_cycle_compress(void){ + int i; + Blob b1, b2, b3; + for(i=2; iaData; + int j = p->nUsed; + int i, n; + for(i=n=0; i=p->nAlloc ){ + blob_resize(p, j); + z = p->aData; + } + p->nUsed = j; + z[j] = 0; + while( j>i ){ + if( (z[--j] = z[--i]) =='\n' ){ + z[--j] = '\r'; + } + } +} +#endif + +/* +** Remove every \r character from the given blob, replacing each one with +** a \n character if it was not already part of a \r\n pair. +*/ +void blob_to_lf_only(Blob *p){ + int i, j; + char *z = blob_materialize(p); + for(i=j=0; z[i]; i++){ + if( z[i]!='\r' ) z[j++] = z[i]; + else if( z[i+1]!='\n' ) z[j++] = '\n'; + } + z[j] = 0; + p->nUsed = j; +} + +/* +** Convert blob from cp1252 to UTF-8. 
As cp1252 is a superset +** of iso8859-1, this is useful on UNIX as well. +** +** This table contains the character translations for 0x80..0xA0. +*/ + +static const unsigned short cp1252[32] = { + 0x20ac, 0x81, 0x201A, 0x0192, 0x201E, 0x2026, 0x2020, 0x2021, + 0x02C6, 0x2030, 0x0160, 0x2039, 0x0152, 0x8D, 0x017D, 0x8F, + 0x90, 0x2018, 0x2019, 0x201C, 0x201D, 0x2022, 0x2013, 0x2014, + 0x2DC, 0x2122, 0x0161, 0x203A, 0x0153, 0x9D, 0x017E, 0x0178 +}; + +void blob_cp1252_to_utf8(Blob *p){ + unsigned char *z = (unsigned char *)p->aData; + int j = p->nUsed; + int i, n; + for(i=n=0; i=0x80 ){ + if( (z[i]<0xa0) && (cp1252[z[i]&0x1f]>=0x800) ){ + n++; + } + n++; + } + } + j += n; + if( j>=p->nAlloc ){ + blob_resize(p, j); + z = (unsigned char *)p->aData; + } + p->nUsed = j; + z[j] = 0; + while( j>i ){ + if( z[--i]>=0x80 ){ + if( z[i]<0xa0 ){ + unsigned short sym = cp1252[z[i]&0x1f]; + if( sym>=0x800 ){ + z[--j] = 0x80 | (sym&0x3f); + z[--j] = 0x80 | ((sym>>6)&0x3f); + z[--j] = 0xe0 | (sym>>12); + }else{ + z[--j] = 0x80 | (sym&0x3f); + z[--j] = 0xc0 | (sym>>6); + } + }else{ + z[--j] = 0x80 | (z[i]&0x3f); + z[--j] = 0xC0 | (z[i]>>6); + } + }else{ + z[--j] = z[i]; + } + } +} + +/* +** Shell-escape the given string. Append the result to a blob. +*/ +void shell_escape(Blob *pBlob, const char *zIn){ + int n = blob_size(pBlob); + int k = strlen(zIn); + int i, c; + char *z; + for(i=0; (c = zIn[i])!=0; i++){ + if( fossil_isspace(c) || c=='"' || (c=='\\' && zIn[i+1]!=0) ){ + blob_appendf(pBlob, "\"%s\"", zIn); + z = blob_buffer(pBlob); + for(i=n+1; i<=n+k; i++){ + if( z[i]=='"' ) z[i] = '_'; + } + return; + } + } + blob_append(pBlob, zIn, -1); +} + +/* +** A read(2)-like impl for the Blob class. Reads (copies) up to nLen +** bytes from pIn, starting at position pIn->iCursor, and copies them +** to pDest (which must be valid memory at least nLen bytes long). +** +** Returns the number of bytes read/copied, which may be less than +** nLen (if end-of-blob is encountered). 
+** +** Updates pIn's cursor. +** +** Returns 0 if pIn contains no data. +*/ +unsigned int blob_read(Blob *pIn, void * pDest, unsigned int nLen ){ + if( !pIn->aData || (pIn->iCursor >= pIn->nUsed) ){ + return 0; + } else if( (pIn->iCursor + nLen) > (unsigned int)pIn->nUsed ){ + nLen = (unsigned int) (pIn->nUsed - pIn->iCursor); + } + assert( pIn->nUsed > pIn->iCursor ); + assert( (pIn->iCursor+nLen) <= pIn->nUsed ); + if( nLen ){ + memcpy( pDest, &pIn->aData[pIn->iCursor], nLen ); + pIn->iCursor += nLen; + } + return nLen; +} + +/* +** Swaps the contents of the given blobs. Results +** are unspecified if either value is NULL or both +** point to the same blob. +*/ +void blob_swap( Blob *pLeft, Blob *pRight ){ + Blob swap = *pLeft; + *pLeft = *pRight; + *pRight = swap; +} + +/* +** Strip a possible byte-order-mark (BOM) from the blob. On Windows, if there +** is either no BOM at all or an (le/be) UTF-16 BOM, a conversion to UTF-8 is +** done. If useMbcs is false and there is no BOM, the input string is assumed +** to be UTF-8 already, so no conversion is done. 
+*/ +void blob_to_utf8_no_bom(Blob *pBlob, int useMbcs){ + char *zUtf8; + int bomSize = 0; + int bomReverse = 0; + if( starts_with_utf8_bom(pBlob, &bomSize) ){ + struct Blob temp; + zUtf8 = blob_str(pBlob) + bomSize; + blob_zero(&temp); + blob_append(&temp, zUtf8, -1); + blob_swap(pBlob, &temp); + blob_reset(&temp); + }else if( starts_with_utf16_bom(pBlob, &bomSize, &bomReverse) ){ + zUtf8 = blob_buffer(pBlob); + if( bomReverse ){ + /* Found BOM, but with reversed bytes */ + unsigned int i = blob_size(pBlob); + while( i>0 ){ + /* swap bytes of unicode representation */ + char zTemp = zUtf8[--i]; + zUtf8[i] = zUtf8[i-1]; + zUtf8[--i] = zTemp; + } + } + /* Make sure the blob contains two terminating 0-bytes */ + blob_append(pBlob, "", 1); + zUtf8 = blob_str(pBlob) + bomSize; + zUtf8 = fossil_unicode_to_utf8(zUtf8); + blob_set_dynamic(pBlob, zUtf8); + }else if( useMbcs && invalid_utf8(pBlob) ){ +#if defined(_WIN32) || defined(__CYGWIN__) + zUtf8 = fossil_mbcs_to_utf8(blob_str(pBlob)); + blob_reset(pBlob); + blob_append(pBlob, zUtf8, -1); + fossil_mbcs_free(zUtf8); +#else + blob_cp1252_to_utf8(pBlob); +#endif /* _WIN32 */ + } +} ADDED aspell/dict-fossil/fossil.cwl Index: aspell/dict-fossil/fossil.cwl ================================================================== --- /dev/null +++ aspell/dict-fossil/fossil.cwl cannot compute difference between binary files ADDED aspell/dict-fossil/fossil.dat Index: aspell/dict-fossil/fossil.dat ================================================================== --- /dev/null +++ aspell/dict-fossil/fossil.dat @@ -0,0 +1,8 @@ +# Fossil data file +name fossil +charset iso8859-1 +special ' -*- +special _ --- +soundslike fossil +affix fossil +affix-compress true ADDED aspell/dict-fossil/fossil.multi Index: aspell/dict-fossil/fossil.multi ================================================================== --- /dev/null +++ aspell/dict-fossil/fossil.multi @@ -0,0 +1,2 @@ +# Generated with Aspell Dicts "proc" script version 0.60.4 +add 
fossil.rws ADDED aspell/dict-fossil/fossil.rws Index: aspell/dict-fossil/fossil.rws ================================================================== --- /dev/null +++ aspell/dict-fossil/fossil.rws cannot compute difference between binary files ADDED aspell/dict-fossil/fossil.wl Index: aspell/dict-fossil/fossil.wl ================================================================== --- /dev/null +++ aspell/dict-fossil/fossil.wl @@ -0,0 +1,1 @@ +blob_appendf ADDED aspell/dict-fossil/fossil_affix.dat Index: aspell/dict-fossil/fossil_affix.dat ================================================================== --- /dev/null +++ aspell/dict-fossil/fossil_affix.dat @@ -0,0 +1,1203 @@ +# +SET ISO8859-1 +TRY aersiondctmlubpágfvxzíhqñóéúüwkyj_ +# COMPOUNDMIN 3 +# COMPOUNDFLAG N + +# ------------------------------------------------------------------------- +# The affix file below is automatically generated from galician.aff file by +# means of ispellaff2myspell script. Original copyright applies: +# +# Copyright 2001-03 by André Ventas & Ramón Flores, under GNU GPL license +# ------------------------------------------------------------------------- + + +PFX A Y 1 +PFX A 0 anti . + +PFX B Y 1 +PFX B 0 anti- . + +PFX S Y 3 +PFX S 0 des [^hs] +PFX S h des h +PFX S 0 de s + +PFX I Y 4 +PFX I 0 in [^plmr] +PFX I 0 im p +PFX I 0 i [lm] +PFX I 0 ir r + +PFX R Y 1 +PFX R 0 re . 
+ +PFX U Y 1 +PFX U 0 sub [^bhr] + +SFX p Y 12 +SFX p 0 s [^lsmrz] +SFX p l is [au]l +SFX p el éis [^b]el +SFX p ol óis ol +SFX p l s il +SFX p l is [áéíóú]bel +SFX p 0 es [^áé]s +SFX p ás ases ás +SFX p és eses és +SFX p m ns m +SFX p 0 es r +SFX p z ces z + +SFX a Y 1 +SFX a il eis il + +SFX f Y 14 +SFX f o a o +SFX f o as o +SFX f 0 a or +SFX f 0 as or +SFX f 0 a [^é]s +SFX f 0 as [^é]s +SFX f és esa és +SFX f és esas és +SFX f ón oa ón +SFX f ón oas ón +SFX f n 0 án +SFX f n s án +SFX f ao á ao +SFX f ao ás ao + +SFX b Y 8 +SFX b ón ona ón +SFX b ón onas ón +SFX b án ana án +SFX b án anas án +SFX b tor triz tor +SFX b tor trices tor +SFX b dor triz dor +SFX b dor trices dor + +SFX m Y 4 +SFX m o amente o +SFX m 0 mente [^b][^n~][^o] +SFX m ábel abelmente ábel +SFX m íbel ibelmente íbel + +SFX s Y 32 +SFX s o ísimo [^czg]o +SFX s co quísimo co +SFX s zo císimo zo +SFX s o uísimo go +SFX s e ísimo e +SFX s 0 ísimo [lr] +SFX s és esísimo és +SFX s z císimo z +SFX s o ísima [^czg]o +SFX s co quísima co +SFX s zo císima zo +SFX s o uísima go +SFX s e ísima e +SFX s 0 ísima [lr] +SFX s és esísima és +SFX s z císima z +SFX s o ísimos [^czg]o +SFX s co quísimos co +SFX s zo císimos zo +SFX s o uísimos go +SFX s e ísimos e +SFX s 0 ísimos [lr] +SFX s és esísimos és +SFX s z císimos z +SFX s o ísimas [^czg]o +SFX s co quísimas co +SFX s zo císimas zo +SFX s o uísimas go +SFX s e ísimas e +SFX s 0 ísimas [lr] +SFX s és esísimas és +SFX s z císimas z + +SFX d Y 12 +SFX d o idade [^zg]o +SFX d zo cidade zo +SFX d o üidade go +SFX d e idade e +SFX d 0 idade r +SFX d 0 idade [^b][^n~]l +SFX d ábel abilidade ábel +SFX d íbel ibilidade íbel +SFX d z cidade z +SFX d o idade [^zg]o +SFX d o üidades go +SFX d 0 idades [^b][^n~]l + +SFX i Y 10 +SFX i o ismo [^cz]o +SFX i co quismo co +SFX i zo cismo zo +SFX i a ismo [^icg]a +SFX i a smo ia +SFX i ca quismo ca +SFX i ga guismo ga +SFX i e ismo e +SFX i 0 ismo [lr] +SFX i és esismo és + +SFX t Y 18 +SFX t o ista [^cz]o +SFX t co 
quista co +SFX t zo cista zo +SFX t a ista [^izg]a +SFX t a sta ia +SFX t za cista za +SFX t a uista ga +SFX t e ista e +SFX t 0 ista [lr] +SFX t o istas [^cz]o +SFX t co quistas co +SFX t zo cistas zo +SFX t a istas [^izg]a +SFX t a stas ia +SFX t za cistas za +SFX t a uistas ga +SFX t e istas e +SFX t 0 istas [lr] + +SFX h Y 36 +SFX h o iño [^cgzi]o +SFX h co quiño co +SFX h zo ciño zo +SFX h o uiño go +SFX h e iño e +SFX h 0 iño [^é][s] +SFX h és esiño és +SFX h 0 ciño [irun] +SFX h z ciño z +SFX h ó oiño ó +SFX h é eiño é +SFX h ú uciño ú +SFX h 0 iño l +SFX h o iños [^cgzi]o +SFX h co quiños co +SFX h zo ciños zo +SFX h o uiños go +SFX h e iños e +SFX h 0 iños [^é][s] +SFX h és esiños és +SFX h 0 ciños [irun] +SFX h z ciños z +SFX h ó iños ó +SFX h é eiños é +SFX h ú uciños ú +SFX h 0 iños l +SFX h a iña [^zg]a +SFX h ca quiña ca +SFX h za ciña za +SFX h a uiña ga +SFX h á aiña á +SFX h a iñas [^zg]a +SFX h ca quiñas ca +SFX h za ciñas za +SFX h a uiñas ga +SFX h á aiñas á + +SFX z N 10 +SFX z e iño e +SFX z o iño o +SFX z 0 iño l +SFX z e iños e +SFX z o iños o +SFX z 0 iños l +SFX z e iña e +SFX z o iña o +SFX z e iñas e +SFX z o iñas o + +SFX c Y 6 +SFX c r ción [ae]r +SFX c r ción [^x]ir +SFX c r sición por +SFX c r cións [ae]r +SFX c r cións [^x]ir +SFX c r sicións por + +SFX C Y 8 +SFX C tar ción tar +SFX C onar ón cionar +SFX C ir cción air +SFX C ir ción uir +SFX C tar cións tar +SFX C onar óns cionar +SFX C ir ccións air +SFX C ir cións uir + +SFX o Y 4 +SFX o r zón [aei]r +SFX o r sizón or +SFX o r zóns [aei]r +SFX o r sizóns or + +SFX M Y 4 +SFX M r mento [ai]r +SFX M er imento er +SFX M r mentos [ai]r +SFX M er imentos er + +SFX n Y 6 +SFX n r nte [ae]r +SFX n ir ente ir +SFX n r nente or +SFX n r ntes [ae]r +SFX n ir entes ir +SFX n r nentes or + +SFX D Y 4 +SFX D r dor [aei]r +SFX D r dora [aei]r +SFX D r dores [aei]r +SFX D r doras [aei]r + +SFX v Y 6 +SFX v ar ábel ar +SFX v er íbel er +SFX v ir íbel ir +SFX v ar ábeis ar +SFX v er íbeis er 
+SFX v ir íbeis ir + +SFX x Y 4 +SFX x r dela [ai]r +SFX x r delas [ai]r +SFX x er idela er +SFX x er idelas er + +SFX e Y 6 +SFX e ar áncia ar +SFX e er éncia er +SFX e ir éncia ir +SFX e ar áncias ar +SFX e er éncias er +SFX e ir éncias ir + +SFX u Y 2 +SFX u r dura [aei]r +SFX u r duras [aei]r + +SFX y Y 4 +SFX y o eiro o +SFX y o eiros o +SFX y a eira a +SFX y a eiras a + +SFX w Y 8 +SFX w 0 so o +SFX w 0 sa o +SFX w 0 sos o +SFX w 0 sas o +SFX w a oso a +SFX w a osa a +SFX w a osos a +SFX w a osas a + +SFX X Y 116 +SFX X ar o ar +SFX X r s ar +SFX X ar a ar +SFX X r mos ar +SFX X r des ar +SFX X r n ar +SFX X ar e [^cguz]ar +SFX X car que car +SFX X ar ue gar +SFX X ar e [^g]uar +SFX X uar üe guar +SFX X zar ce zar +SFX X ar es [^cguz]ar +SFX X car ques car +SFX X ar ues gar +SFX X ar es [^g]uar +SFX X uar ües guar +SFX X zar ces zar +SFX X ar emos [^cguz]ar +SFX X car quemos car +SFX X ar uemos gar +SFX X ar emos [^g]uar +SFX X uar üemos guar +SFX X zar cemos zar +SFX X ar edes [^cguz]ar +SFX X car quedes car +SFX X ar uedes gar +SFX X ar edes [^g]uar +SFX X uar üedes guar +SFX X zar cedes zar +SFX X ar en [^cguz]ar +SFX X car quen car +SFX X ar uen gar +SFX X ar en [^g]uar +SFX X uar üen guar +SFX X zar cen zar +SFX X er o [^cu]er +SFX X cer zo cer +SFX X uer o guer +SFX X r s [^o]er +SFX X oer óis oer +SFX X er e [^o]er +SFX X oer ói oer +SFX X r mos er +SFX X r des er +SFX X r n er +SFX X er a [^cu]er +SFX X cer za cer +SFX X uer a guer +SFX X er as [^cu]er +SFX X cer zas cer +SFX X uer as guer +SFX X er amos [^cu]er +SFX X cer zamos cer +SFX X uer amos guer +SFX X er ades [^cu]er +SFX X cer zades cer +SFX X uer ades guer +SFX X er an [^cu]er +SFX X cer zan cer +SFX X uer an guer +SFX X ir o [^acuü]ir +SFX X r o air +SFX X cir zo cir +SFX X üir uo üir +SFX X ir o [^g]uir +SFX X uir o guir +SFX X ir es [^aüu]ir +SFX X r s [aü]ir +SFX X r s [^g]uir +SFX X ir es guir +SFX X ir e [^aüu]ir +SFX X r 0 [aü]ir +SFX X r 0 [^g]uir +SFX X ir e guir +SFX X r mos 
[^auü]ir +SFX X ir ímos [aü]ir +SFX X ir ímos [^g]uir +SFX X r mos guir +SFX X r des [^auü]ir +SFX X ir ídes [aü]ir +SFX X ir ídes [^g]uir +SFX X r des guir +SFX X ir en ir +SFX X ir a [^acuü]ir +SFX X r a air +SFX X cir za cir +SFX X üir ua üir +SFX X ir a [^g]uir +SFX X uir a guir +SFX X ir as [^acuü]ir +SFX X r as air +SFX X cir zas cir +SFX X üir uas üir +SFX X ir as [^g]uir +SFX X uir as guir +SFX X ir amos [^acuü]ir +SFX X r amos air +SFX X cir zamos cir +SFX X üir uamos üir +SFX X ir amos [^g]uir +SFX X uir amos guir +SFX X ir ades [^acuü]ir +SFX X r ades air +SFX X cir zades cir +SFX X üir uades üir +SFX X ir ades [^g]uir +SFX X uir ades guir +SFX X ir an [^acuü]ir +SFX X r an air +SFX X cir zan cir +SFX X üir uan üir +SFX X ir an [^g]uir +SFX X uir an guir +SFX X ar á ar +SFX X er é er + +SFX Y Y 192 +SFX Y r ba ar +SFX Y r bas ar +SFX Y ar ábamos ar +SFX Y ar ábades ar +SFX Y r ban ar +SFX Y ar ei [^cguz]ar +SFX Y car quei car +SFX Y ar uei gar +SFX Y ar ei [^g]uar +SFX Y uar üei guar +SFX Y zar cei zar +SFX Y r che ar +SFX Y ar ou ar +SFX Y r stes ar +SFX Y 0 on ar +SFX Y 0 a ar +SFX Y 0 as ar +SFX Y ar áramos ar +SFX Y ar áredes ar +SFX Y 0 an ar +SFX Y 0 ei ar +SFX Y 0 ás ar +SFX Y 0 á ar +SFX Y 0 emos ar +SFX Y 0 edes ar +SFX Y 0 án ar +SFX Y r se ar +SFX Y r ses ar +SFX Y ar ásemos ar +SFX Y ar ásedes ar +SFX Y r sen ar +SFX Y 0 ia ar +SFX Y 0 ias ar +SFX Y 0 íamos ar +SFX Y 0 íades ar +SFX Y 0 ian ar +SFX Y 0 es ar +SFX Y 0 mos ar +SFX Y 0 des ar +SFX Y 0 en ar +SFX Y r ndo ar +SFX Y r do ar +SFX Y r da ar +SFX Y r dos ar +SFX Y r das ar +SFX Y r i ar +SFX Y er ia [^o]er +SFX Y er ía oer +SFX Y er ias [^o]er +SFX Y er ías oer +SFX Y er íamos er +SFX Y er íades er +SFX Y er ian [^o]er +SFX Y er ían oer +SFX Y er in er +SFX Y r che er +SFX Y r u er +SFX Y r stes er +SFX Y 0 on er +SFX Y 0 a er +SFX Y 0 as er +SFX Y er éramos er +SFX Y er érades er +SFX Y 0 an er +SFX Y 0 ei er +SFX Y 0 ás er +SFX Y 0 á er +SFX Y 0 emos er +SFX Y 0 edes er +SFX Y 0 án 
er +SFX Y r se er +SFX Y r ses er +SFX Y er ésemos er +SFX Y er ésedes er +SFX Y r sen er +SFX Y 0 ia er +SFX Y 0 ias er +SFX Y 0 íamos er +SFX Y 0 íades er +SFX Y 0 ian er +SFX Y 0 es er +SFX Y 0 mos er +SFX Y 0 des er +SFX Y 0 en er +SFX Y r ndo er +SFX Y er ido [^o]er +SFX Y er ído oer +SFX Y er ida [^o]er +SFX Y er ída oer +SFX Y er idos [^o]er +SFX Y er ídos oer +SFX Y er idas [^o]er +SFX Y er ídas oer +SFX Y r i er +SFX Y r a [^auü]ir +SFX Y ir ía [aü]ir +SFX Y ir ía [^g]uir +SFX Y r a guir +SFX Y r as [^auü]ir +SFX Y ir ías [aü]ir +SFX Y ir ías [^g]uir +SFX Y r as guir +SFX Y ir íamos ir +SFX Y ir íades ir +SFX Y r an [^auü]ir +SFX Y ir ían [aü]ir +SFX Y ir ían [^g]uir +SFX Y r an guir +SFX Y ir in ir +SFX Y r che [^auü]ir +SFX Y ir íche [aü]ir +SFX Y ir íche [^g]uir +SFX Y r che guir +SFX Y r u ir +SFX Y r stes [^auü]ir +SFX Y ir ístes [aü]ir +SFX Y ir ístes [^g]uir +SFX Y r stes guir +SFX Y 0 on [^auü]ir +SFX Y ir íron [aü]ir +SFX Y ir íron [^g]uir +SFX Y 0 on guir +SFX Y 0 a [^auü]ir +SFX Y ir íra [aü]ir +SFX Y ir íra [^g]uir +SFX Y 0 a guir +SFX Y 0 as [^auü]ir +SFX Y ir íras [aü]ir +SFX Y ir íras [^g]uir +SFX Y 0 as guir +SFX Y ir íramos ir +SFX Y ir írades ir +SFX Y 0 an [^auü]ir +SFX Y ir íran [aü]ir +SFX Y ir íran [^g]uir +SFX Y 0 an guir +SFX Y 0 ei ir +SFX Y 0 ás ir +SFX Y 0 á ir +SFX Y 0 emos ir +SFX Y 0 edes ir +SFX Y 0 án ir +SFX Y r se [^auü]ir +SFX Y ir íse [aü]ir +SFX Y ir íse [^g]uir +SFX Y r se guir +SFX Y r ses [^auü]ir +SFX Y ir íses [aü]ir +SFX Y ir íses [^g]uir +SFX Y r ses guir +SFX Y ir ísemos ir +SFX Y ir ísedes ir +SFX Y r sen [^auü]ir +SFX Y ir ísen [aü]ir +SFX Y ir ísen [^g]uir +SFX Y r sen guir +SFX Y 0 ia ir +SFX Y 0 ias ir +SFX Y 0 íamos ir +SFX Y 0 íades ir +SFX Y 0 ian ir +SFX Y 0 es [^auü]ir +SFX Y ir íres [aü]ir +SFX Y ir íres [^g]uir +SFX Y 0 es guir +SFX Y 0 mos ir +SFX Y 0 des ir +SFX Y 0 en [^auü]ir +SFX Y ir íren [aü]ir +SFX Y ir íren [^g]uir +SFX Y 0 en guir +SFX Y r ndo ir +SFX Y r do [^auü]ir +SFX Y ir ído [aü]ir 
+SFX Y ir ído [^g]uir +SFX Y r do guir +SFX Y r da [^auü]ir +SFX Y ir ída [aü]ir +SFX Y ir ída [^g]uir +SFX Y r da guir +SFX Y r dos [^auü]ir +SFX Y ir ídos [aü]ir +SFX Y ir ídos [^g]uir +SFX Y r dos guir +SFX Y r das [^auü]ir +SFX Y ir ídas [aü]ir +SFX Y ir ídas [^g]uir +SFX Y r das guir +SFX Y r 0 [^auü]ir +SFX Y ir í [aü]ir +SFX Y ir í [^g]uir +SFX Y r 0 guir + +SFX Z Y 166 +SFX Z er lo ler +SFX Z r s ler +SFX Z er e ler +SFX Z r mos ler +SFX Z r des ler +SFX Z r n ler +SFX Z er la ler +SFX Z er las ler +SFX Z er lamos ler +SFX Z er lades ler +SFX Z er lan ler +SFX Z er é ler +SFX Z edir ido edir +SFX Z ir es edir +SFX Z ir e edir +SFX Z r mos edir +SFX Z r des edir +SFX Z ir en edir +SFX Z edir ida edir +SFX Z edir idas edir +SFX Z edir idamos edir +SFX Z edir idades edir +SFX Z edir idan edir +SFX Z eguir igo eguir +SFX Z ir es eguir +SFX Z ir e eguir +SFX Z r mos eguir +SFX Z r des eguir +SFX Z ir en eguir +SFX Z eguir iga eguir +SFX Z eguir igas eguir +SFX Z eguir igamos eguir +SFX Z eguir igades eguir +SFX Z eguir igan eguir +SFX Z elir ilo elir +SFX Z ir es elir +SFX Z ir e elir +SFX Z r mos elir +SFX Z r des elir +SFX Z ir en elir +SFX Z elir ila elir +SFX Z elir ilas elir +SFX Z elir ilamos elir +SFX Z elir ilades elir +SFX Z elir ilan elir +SFX Z entir into entir +SFX Z ir es entir +SFX Z ir e entir +SFX Z r mos entir +SFX Z r des entir +SFX Z ir en entir +SFX Z entir inta entir +SFX Z entir intas entir +SFX Z entir intamos entir +SFX Z entir intades entir +SFX Z entir intan entir +SFX Z erir iro erir +SFX Z ir es erir +SFX Z ir e erir +SFX Z r mos erir +SFX Z r des erir +SFX Z ir en erir +SFX Z erir ira erir +SFX Z erir iras erir +SFX Z erir iramos erir +SFX Z erir irades erir +SFX Z erir iran erir +SFX Z ernir irno ernir +SFX Z ir es ernir +SFX Z ir e ernir +SFX Z r mos ernir +SFX Z r des ernir +SFX Z ir en ernir +SFX Z ernir irna ernir +SFX Z ernir irnas ernir +SFX Z ernir irnamos ernir +SFX Z ernir irnades ernir +SFX Z ernir irnan ernir +SFX Z ertir 
irto ertir +SFX Z ir es ertir +SFX Z ir e ertir +SFX Z r mos ertir +SFX Z r des ertir +SFX Z ir en ertir +SFX Z ertir irta ertir +SFX Z ertir irtas ertir +SFX Z ertir irtamos ertir +SFX Z ertir irtades ertir +SFX Z ertir irtan ertir +SFX Z estir isto estir +SFX Z ir es estir +SFX Z ir e estir +SFX Z r mos estir +SFX Z r des estir +SFX Z ir en estir +SFX Z estir ista estir +SFX Z estir istas estir +SFX Z estir istamos estir +SFX Z estir istades estir +SFX Z estir istan estir +SFX Z etir ito etir +SFX Z ir es etir +SFX Z ir e etir +SFX Z r mos etir +SFX Z r des etir +SFX Z ir en etir +SFX Z etir ita etir +SFX Z etir itas etir +SFX Z etir itamos etir +SFX Z etir itades etir +SFX Z etir itan etir +SFX Z obrir ubro obrir +SFX Z ir es obrir +SFX Z ir e obrir +SFX Z r mos obrir +SFX Z r des obrir +SFX Z ir en obrir +SFX Z obrir ubra obrir +SFX Z obrir ubras obrir +SFX Z obrir ubramos obrir +SFX Z obrir ubrades obrir +SFX Z obrir ubran obrir +SFX Z olir ulo olir +SFX Z ir es olir +SFX Z ir e olir +SFX Z r mos olir +SFX Z r des olir +SFX Z ir en olir +SFX Z olir ula olir +SFX Z olir ulas olir +SFX Z olir ulamos olir +SFX Z olir ulades olir +SFX Z olir ulan olir +SFX Z ir o ulir +SFX Z ulir oles ulir +SFX Z ulir ole ulir +SFX Z r mos ulir +SFX Z r des ulir +SFX Z ulir olen ulir +SFX Z ir a ulir +SFX Z ir as ulir +SFX Z ir amos ulir +SFX Z ir ades ulir +SFX Z ir an ulir +SFX Z ir o udir +SFX Z udir odes udir +SFX Z udir ode udir +SFX Z r mos udir +SFX Z r des udir +SFX Z udir oden udir +SFX Z ir a udir +SFX Z ir as udir +SFX Z ir amos udir +SFX Z ir ades udir +SFX Z ir an udir +SFX Z ir o uxir +SFX Z uxir oxes uxir +SFX Z uxir oxe uxir +SFX Z r mos uxir +SFX Z r des uxir +SFX Z uxir oxen uxir +SFX Z ir a uxir +SFX Z ir as uxir +SFX Z ir amos uxir +SFX Z ir ades uxir +SFX Z ir an uxir + +SFX K Y 410 +SFX K edir ido gredir +SFX K edir ides gredir +SFX K edir ide gredir +SFX K r mos gredir +SFX K r des gredir +SFX K edir iden gredir +SFX K edir ida gredir +SFX K edir idas gredir 
+SFX K edir idamos gredir +SFX K edir idades gredir +SFX K edir idan gredir +SFX K r o rir +SFX K r s rir +SFX K r mos rir +SFX K r des rir +SFX K r en rir +SFX K r amos rir +SFX K r ades rir +SFX K cer go dicer +SFX K cer s dicer +SFX K icer i dicer +SFX K r mos dicer +SFX K r des dicer +SFX K cer n dicer +SFX K cer xen dicer +SFX K cer xeche dicer +SFX K cer xo dicer +SFX K cer xemos dicer +SFX K cer xestes dicer +SFX K cer xeron dicer +SFX K cer rei dicer +SFX K cer rás dicer +SFX K cer rá dicer +SFX K cer remos dicer +SFX K cer redes dicer +SFX K cer rán dicer +SFX K er ia dicer +SFX K er ias dicer +SFX K er íamos dicer +SFX K er íades dicer +SFX K er ian dicer +SFX K cer xera dicer +SFX K cer xeras dicer +SFX K cer xéramos dicer +SFX K cer xérades dicer +SFX K cer xeran dicer +SFX K cer ria dicer +SFX K cer rias dicer +SFX K cer ríamos dicer +SFX K cer ríades dicer +SFX K cer rian dicer +SFX K cer ga dicer +SFX K cer gas dicer +SFX K cer gamos dicer +SFX K cer gades dicer +SFX K cer gan dicer +SFX K cer xese dicer +SFX K cer xeses dicer +SFX K cer xésemos dicer +SFX K cer xésedes dicer +SFX K cer xesen dicer +SFX K cer xer dicer +SFX K cer xeres dicer +SFX K cer xermos dicer +SFX K cer xerdes dicer +SFX K cer xeren dicer +SFX K 0 es dicer +SFX K 0 mos dicer +SFX K 0 des dicer +SFX K 0 en dicer +SFX K r ndo dicer +SFX K cer to dicer +SFX K cer ta dicer +SFX K cer tos dicer +SFX K cer tas dicer +SFX K r i dicer +SFX K er é dicer +SFX K cer go facer +SFX K cer i facer +SFX K r mos facer +SFX K r des facer +SFX K acer ixen facer +SFX K acer ixeche facer +SFX K acer ixo facer +SFX K acer ixemos facer +SFX K acer ixestes facer +SFX K acer ixeron facer +SFX K cer rei facer +SFX K cer rás facer +SFX K cer rá facer +SFX K cer remos facer +SFX K cer redes facer +SFX K cer rán facer +SFX K er ia facer +SFX K er ias facer +SFX K er íamos facer +SFX K er íades facer +SFX K er ian facer +SFX K acer ixera facer +SFX K acer ixeras facer +SFX K acer ixéramos facer +SFX K acer 
ixérades facer +SFX K acer ixeran facer +SFX K cer ria facer +SFX K cer rias facer +SFX K cer ríamos facer +SFX K cer ríades facer +SFX K cer rian facer +SFX K cer ga facer +SFX K cer gas facer +SFX K cer gamos facer +SFX K cer gades facer +SFX K cer gan facer +SFX K acer ixese facer +SFX K acer ixeses facer +SFX K acer ixésemos facer +SFX K acer ixésedes facer +SFX K acer ixesen facer +SFX K acer ixer facer +SFX K acer ixeres facer +SFX K acer ixermos facer +SFX K acer ixerdes facer +SFX K acer ixeren facer +SFX K 0 es facer +SFX K 0 mos facer +SFX K 0 des facer +SFX K 0 en facer +SFX K r ndo facer +SFX K acer eito facer +SFX K acer eita facer +SFX K acer eitos facer +SFX K acer eitas facer +SFX K r i facer +SFX K er é facer +SFX K r ño or +SFX K or ós or +SFX K r mos or +SFX K r ndes or +SFX K r ñen or +SFX K or uxen or +SFX K or uxeche or +SFX K or uxo or +SFX K or uxemos or +SFX K or uxestes or +SFX K or uxeron or +SFX K 0 ei or +SFX K 0 ás or +SFX K 0 á or +SFX K 0 emos or +SFX K 0 edes or +SFX K 0 án or +SFX K or uña or +SFX K or uñas or +SFX K or úñamos or +SFX K or úñades or +SFX K or uñan or +SFX K or uxera or +SFX K or uxeras or +SFX K or uxéramos or +SFX K or uxérades or +SFX K or uxeran or +SFX K 0 ia or +SFX K 0 ias or +SFX K 0 íamos or +SFX K 0 íades or +SFX K 0 ian or +SFX K r ña or +SFX K r ñas or +SFX K r ñamos or +SFX K r ñades or +SFX K r ñan or +SFX K or uxese or +SFX K or uxeses or +SFX K or uxésemos or +SFX K or uxésedes or +SFX K or uxesen or +SFX K or uxer or +SFX K or uxeres or +SFX K or uxermos or +SFX K or uxerdes or +SFX K or uxeren or +SFX K 0 es or +SFX K 0 mos or +SFX K 0 des or +SFX K 0 en or +SFX K r ndo or +SFX K r sto or +SFX K r sta or +SFX K r stos or +SFX K r stas or +SFX K r nde or +SFX K or ó or +SFX K cer z pracer +SFX K acer ouven pracer +SFX K acer ouveche pracer +SFX K acer ouvo pracer +SFX K acer ouvemos pracer +SFX K acer ouvestes pracer +SFX K acer ouveron pracer +SFX K 0 ei pracer +SFX K 0 ás pracer +SFX K 0 á pracer 
+SFX K 0 emos pracer +SFX K 0 edes pracer +SFX K 0 án pracer +SFX K er ia pracer +SFX K er ias pracer +SFX K er íamos pracer +SFX K er íades pracer +SFX K er ian pracer +SFX K acer ouvera pracer +SFX K acer ouveras pracer +SFX K acer ouvéramos pracer +SFX K acer ouvérades pracer +SFX K acer ouveran pracer +SFX K 0 ia pracer +SFX K 0 ias pracer +SFX K 0 íamos pracer +SFX K 0 íades pracer +SFX K 0 ian pracer +SFX K acer ouvese pracer +SFX K acer ouveses pracer +SFX K acer ouvésemos pracer +SFX K acer ouvésedes pracer +SFX K acer ouvesen pracer +SFX K acer ouver pracer +SFX K acer ouveres pracer +SFX K acer ouvermos pracer +SFX K acer ouverdes pracer +SFX K acer ouveren pracer +SFX K 0 es pracer +SFX K 0 mos pracer +SFX K 0 des pracer +SFX K 0 en pracer +SFX K r ndo pracer +SFX K er ido pracer +SFX K er ida pracer +SFX K er idos pracer +SFX K er idas pracer +SFX K r i pracer +SFX K er é pracer +SFX K r ño ter +SFX K r mos ter +SFX K r ndes ter +SFX K r ñen ter +SFX K er iven ter +SFX K er iveche ter +SFX K er ivo ter +SFX K er ivemos ter +SFX K er ivestes ter +SFX K er iveron ter +SFX K 0 ei ter +SFX K 0 ás ter +SFX K 0 á ter +SFX K 0 emos ter +SFX K 0 edes ter +SFX K 0 án ter +SFX K er iña ter +SFX K er iñas ter +SFX K er íñamos ter +SFX K er íñades ter +SFX K er iñan ter +SFX K er ivera ter +SFX K er iveras ter +SFX K er ivéramos ter +SFX K er ivérades ter +SFX K er iveran ter +SFX K 0 ia ter +SFX K 0 ias ter +SFX K 0 íamos ter +SFX K 0 íades ter +SFX K 0 ian ter +SFX K r ña ter +SFX K r ñas ter +SFX K r ñamos ter +SFX K r ñades ter +SFX K r ñan ter +SFX K er ivese ter +SFX K er iveses ter +SFX K er ivésemos ter +SFX K er ivésedes ter +SFX K er ivesen ter +SFX K er iver ter +SFX K er iveres ter +SFX K er ivermos ter +SFX K er iverdes ter +SFX K er iveren ter +SFX K 0 es ter +SFX K 0 mos ter +SFX K 0 des ter +SFX K 0 en ter +SFX K r ndo ter +SFX K er ido ter +SFX K er ida ter +SFX K er idos ter +SFX K er idas ter +SFX K r nde ter +SFX K er é ter +SFX K r xo ver +SFX 
K r mos ver +SFX K r des ver +SFX K er in ver +SFX K er iche ver +SFX K er iu ver +SFX K er imos ver +SFX K er istes ver +SFX K er iron ver +SFX K 0 ei ver +SFX K 0 ás ver +SFX K 0 á ver +SFX K 0 emos ver +SFX K 0 edes ver +SFX K 0 án ver +SFX K er ia ver +SFX K er ias ver +SFX K er íamos ver +SFX K er íades ver +SFX K er ian ver +SFX K er ira ver +SFX K er iras ver +SFX K er íramos ver +SFX K er írades ver +SFX K er iran ver +SFX K 0 ia ver +SFX K 0 ias ver +SFX K 0 íamos ver +SFX K 0 íades ver +SFX K 0 ian ver +SFX K r xa ver +SFX K r xas ver +SFX K r xamos ver +SFX K r xades ver +SFX K r xan ver +SFX K er ise ver +SFX K er ises ver +SFX K er ísemos ver +SFX K er ísedes ver +SFX K er isen ver +SFX K er ir ver +SFX K er ires ver +SFX K er irmos ver +SFX K er irdes ver +SFX K er iren ver +SFX K 0 es ver +SFX K 0 mos ver +SFX K 0 des ver +SFX K 0 en ver +SFX K r ndo ver +SFX K er isto ver +SFX K er ista ver +SFX K er istos ver +SFX K er istas ver +SFX K r de ver +SFX K er é ver +SFX K ir eño vir +SFX K r mos vir +SFX K r ndes vir +SFX K ir eñen vir +SFX K r n vir +SFX K r ñeche vir +SFX K ir eu vir +SFX K r ñemos vir +SFX K r ñestes vir +SFX K r ñeron vir +SFX K 0 ei vir +SFX K 0 ás vir +SFX K 0 á vir +SFX K 0 emos vir +SFX K 0 edes vir +SFX K 0 án vir +SFX K r ña vir +SFX K r ñas vir +SFX K ir íñamos vir +SFX K ir íñades vir +SFX K r ñan vir +SFX K r ñera vir +SFX K r ñeras vir +SFX K r ñéramos vir +SFX K r ñérades vir +SFX K r ñeran vir +SFX K 0 ia vir +SFX K 0 ias vir +SFX K 0 íamos vir +SFX K 0 íades vir +SFX K 0 ian vir +SFX K ir eña vir +SFX K ir eñas vir +SFX K ir eñamos vir +SFX K ir eñades vir +SFX K ir eñan vir +SFX K r ñese vir +SFX K r ñeses vir +SFX K r ñésemos vir +SFX K r ñésedes vir +SFX K r ñesen vir +SFX K r ñer vir +SFX K r ñeres vir +SFX K r ñermos vir +SFX K r ñerdes vir +SFX K r ñeren vir +SFX K 0 es vir +SFX K 0 mos vir +SFX K 0 des vir +SFX K 0 en vir +SFX K r ndo vir +SFX K r do vir +SFX K r da vir +SFX K r dos vir +SFX K r das vir +SFX K r 
nde vir
+
+REP 8
+REP ci z
+REP z ci
+REP cc c
+REP c cc
+REP able ábel
+REP ible íbel
+REP ict it
+REP uct ut
ADDED aspell/dict-fossil/fossil_phonet.dat
Index: aspell/dict-fossil/fossil_phonet.dat
==================================================================
--- /dev/null
+++ aspell/dict-fossil/fossil_phonet.dat
@@ -0,0 +1,1 @@
+version 1.0
ADDED aspell/en_FOSSIL.pws
Index: aspell/en_FOSSIL.pws
==================================================================
--- /dev/null
+++ aspell/en_FOSSIL.pws
@@ -0,0 +1,2 @@
+personal_ws-1.1 en 0
+SQL
ADDED aspell/en_FOSSIL.rws
Index: aspell/en_FOSSIL.rws
==================================================================
--- /dev/null
+++ aspell/en_FOSSIL.rws
cannot compute difference between binary files
ADDED aspell/en_FOSSIL.wl
Index: aspell/en_FOSSIL.wl
==================================================================
--- /dev/null
+++ aspell/en_FOSSIL.wl
@@ -0,0 +1,2 @@
+blob_appendf
+blob_appendf()
ADDED aspell/en_FOSSIL_affix.dat
Index: aspell/en_FOSSIL_affix.dat
==================================================================
--- /dev/null
+++ aspell/en_FOSSIL_affix.dat
@@ -0,0 +1,3 @@
+#
+SET ISO8859-1
+TRY aersiondctmlubpágfvxzíhqñóéúüwkyj_()
ADDED aspell/en_FOSSIL_phonet.dat
Index: aspell/en_FOSSIL_phonet.dat
==================================================================
--- /dev/null
+++ aspell/en_FOSSIL_phonet.dat
@@ -0,0 +1,1 @@
+version 1.0
ADDED aspell/example-c
Index: aspell/example-c
==================================================================
--- /dev/null
+++ aspell/example-c
cannot compute difference between binary files
ADDED aspell/example-c.c
Index: aspell/example-c.c
==================================================================
--- /dev/null
+++ aspell/example-c.c
@@ -0,0 +1,337 @@
+/* This file is part of The New Aspell
+ * Copyright (C) 2000-2001 by Kevin Atkinson under the GNU LGPL
+ * license version 2.0 or 2.1.
You should have received a copy of the
+ * LGPL license along with this library if you did not you can find it
+ * at http://www.gnu.org/.
+*/
+
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+
+#include "aspell.h"
+
+static void print_word_list(AspellSpeller * speller,
+                            const AspellWordList *wl,
+                            char delem)
+{
+  if (wl == 0) {
+    printf("Error: %s\n", aspell_speller_error_message(speller));
+  } else {
+    AspellStringEnumeration * els = aspell_word_list_elements(wl);
+    const char * word;
+    while ( (word = aspell_string_enumeration_next(els)) != 0) {
+      fputs(word, stdout);
+      putc(delem, stdout);
+    }
+  }
+}
+
+#define check_for_error(speller) \
+  if (aspell_speller_error(speller) != 0) { \
+    printf("Error: %s\n", aspell_speller_error_message(speller)); \
+    break; \
+  }
+
+#define check_for_config_error(config) \
+  if (aspell_config_error(config) != 0) { \
+    printf("Error: %s\n", aspell_config_error_message(config)); \
+    break; \
+  }
+
+static void check_document(AspellSpeller * speller, const char * file);
+
+int main(int argc, const char *argv[])
+{
+  AspellCanHaveError * ret;
+  AspellSpeller * speller;
+  int have;
+  char word[81];
+  char * p;
+  char * word_end;
+  AspellConfig * config;
+
+  if (argc < 2) {
+    printf("Usage: %s <language> [<size>|- [[<jargon>|- [<encoding>]]]\n", argv[0]);
+    return 1;
+  }
+
+  config = new_aspell_config();
+
+  aspell_config_replace(config, "lang", argv[1]);
+
+  if (argc >= 3 && argv[2][0] != '-' && argv[2][1] != '\0')
+    aspell_config_replace(config, "size", argv[2]);
+
+  if (argc >= 4 && argv[3][0] != '-')
+    aspell_config_replace(config, "jargon", argv[3]);
+
+  if (argc >= 5 && argv[4][0] != '-')
+    aspell_config_replace(config, "encoding", argv[4]);
+
+  ret = new_aspell_speller(config);
+
+  delete_aspell_config(config);
+
+  if (aspell_error(ret) != 0) {
+    printf("Error: %s\n",aspell_error_message(ret));
+    delete_aspell_can_have_error(ret);
+    return 2;
+  }
+  speller = to_aspell_speller(ret);
+  config = aspell_speller_config(speller);
+
+  fputs("Using: ", stdout);
+
fputs(aspell_config_retrieve(config, "lang"), stdout);
+  fputs("-", stdout);
+  fputs(aspell_config_retrieve(config, "jargon"), stdout);
+  fputs("-", stdout);
+  fputs(aspell_config_retrieve(config, "size"), stdout);
+  fputs("-", stdout);
+  fputs(aspell_config_retrieve(config, "module"), stdout);
+  fputs("\n\n", stdout);
+
+  puts("Type \"h\" for help.\n");
+
+  while (fgets(word, 80, stdin) != 0) {
+
+    /* remove trailing spaces */
+
+    word_end = strchr(word, '\0') - 1;
+    while (word_end != word && (*word_end == '\n' || *word_end == ' '))
+      --word_end;
+    ++word_end;
+    *word_end = '\0';
+
+    putchar('\n');
+    switch (word[0]) {
+    case '\0':
+      break;
+    case 'h':
+      puts(
+        "Usage: \n"
+        "  h(elp)      help\n"
+        "  c <word>    check if a word is the correct spelling\n"
+        "  s <word>    print out a list of suggestions for a word\n"
+        "  a <word>    add a word to the personal word list\n"
+        "  i <word>    ignore a word for the rest of the session\n"
+        "  d <file>    spell checks a document\n"
+        "  p           dumps the personal word list\n"
+        "  P           dumps the session word list\n"
+        "  m           dumps the main word list\n"
+        "  o