Compare commits


2 commits

Author | SHA1 | Message | Date
Caolán McNamara | 03471fe354 | update to latest version (upstream en_GB is dead so take live scowl as upstream, but take the ize+ise dictionary) | 2017-09-28 13:21:31 +01:00
Caolán McNamara | 9510e475c7 | Resolves: rhbz#1494968 perl regex rule changes result in broken en_US dict | 2017-09-25 12:05:29 +01:00
16 changed files with 77 additions and 312 deletions

.gitignore

@ -1,2 +1,3 @@
/en_GB.zip
/rel-2014.08.11.1.tar.gz
/scowl-2017.08.24.tar.gz


@ -1,56 +0,0 @@
--- wordlist.orig/en_GB.dic 2005-05-26 11:49:30.000000000 +0100
+++ wordlist/en_GB.dic 2008-11-28 10:01:37.000000000 +0000
@@ -1,4 +1,4 @@
-46280
+46286
abaft
abbreviation/M
abdicate/DNGSn
@@ -2278,6 +2278,7 @@
hysterectomy/SM
Hyundai/M
ICC/M
+i
icebox/SM
icicle/SM
iconoclasm/MS
@@ -2470,6 +2471,7 @@
Ithacan
its
ix
+j
jackknife/DGMS
Jacqueline
Jaeger/M
@@ -2764,6 +2766,7 @@
Mabel/M
Macedon
Macedonia/M
+m
macintosh/SM
MacIntyre
Mackenzie
@@ -3222,6 +3225,7 @@
nuttiness/S
nymphomaniac/S
Oakland/M
+o
ob.
obeyer/EM
obfuscation/M
@@ -4797,6 +4801,7 @@
tyro/SM
UFO/S
Ukrainian/S
+u
ulcerate/SGNDn
ulcerous
Ulrika/M
@@ -4962,6 +4967,7 @@
vulnerability/SI
vulva/M
WAC
+w
wagon/SM
waitress/MS
Waldemar/M
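
Note on the patch above: it adds six single-letter entries (i, j, m, o, u, w) and bumps the count on line 1 of en_GB.dic from 46280 to 46286 to match. The bookkeeping can be sketched in Python as follows, assuming the usual Hunspell layout in which the first line of a .dic file holds the (approximate) entry count. `add_words` is a hypothetical helper for illustration, not part of this package, and the sorting is only for the toy example.

```python
def add_words(dic_lines, new_words):
    """Return a new .dic line list with new_words merged in and the header count updated.

    dic_lines[0] is assumed to be the entry count; the remaining lines are
    entries such as "abaft" or "abdicate/DNGSn" (word, optional /flags).
    """
    count = int(dic_lines[0])
    entries = dic_lines[1:]
    # Compare on the bare word so "icebox/SM" and "icebox" count as the same entry.
    existing = {entry.split("/", 1)[0] for entry in entries}
    to_add = [word for word in new_words if word not in existing]
    merged = sorted(entries + to_add, key=str.lower)
    return [str(count + len(to_add))] + merged


dic = ["3", "abaft", "icebox/SM", "jackknife/DGMS"]
updated = add_words(dic, ["i", "j"])
print(updated)
```

The point of the sketch is only that the header and the body change together: adding N entries means the first line goes up by N.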


@ -1,11 +0,0 @@
--- wordlist.orig/en_GB.dic 2017-09-21 11:24:10.673817684 +0100
+++ wordlist/en_GB.dic 2017-09-21 11:31:39.596987079 +0100
@@ -27307,7 +27307,7 @@
estimations/f
estrange/DGLS
estranger/M
-etc.
+etc
eternal/PY
ethereal/PY
ethic/3MSY


@ -1,20 +0,0 @@
--- wordlist.orig/en_GB.dic 2009-06-06 15:16:16.000000000 +0100
+++ wordlist/en_GB.dic 2009-06-06 15:17:28.000000000 +0100
@@ -19953,7 +19953,7 @@
technology/3wSM1
Ted/M
tee/SGdM
-TEirtza/M
+Teirtza/M
tellurium/M
temp/GMRSTD
tempera/MLS
@@ -41226,7 +41226,7 @@
adore/lRSNnGkD
Adrian/M
adroit/TYP
-ADte
+ADTe
adulterer/SM
adumbration/M
advantageousness/E


@ -1,16 +0,0 @@
--- wordlist/en_GB.dic.orig 2012-04-10 22:51:16.471732570 +0100
+++ wordlist/en_GB.dic 2012-04-10 22:55:00.018995514 +0100
@@ -1,4 +1,4 @@
-46285
+46286
abaft
abbreviation/M
abdicate/DNGSn
@@ -22588,6 +22588,7 @@
halter/d
halyard/MS
Hamal/M
+hames
Hamlin/M
hamper/dS
handbag/SMDG


@ -1,8 +0,0 @@
--- scowl/speller/en.dic.supp.orig 2008-11-28 10:10:01.000000000 +0000
+++ scowl/speller/en.dic.supp 2008-11-28 10:10:10.000000000 +0000
@@ -21,3 +21,5 @@
7th/pt
8th/pt
9th/pt
+e.g.
+i.e.


@ -1,9 +1,9 @@
--- scowl/r/special/abbreviations 2008-11-29 23:02:41.000000000 +0000
+++ scowl/r/special/abbreviations 2010-07-30 11:18:33.000000000 +0100
@@ -3,6 +3,11 @@
API
ATM
BTW
@@ -6,6 +6,11 @@
DVR
DVR's
DVRs
+EB
+Eb
+Ei
@ -12,41 +12,40 @@
EULA
EULAs
FAQ
@@ -16,39 +21,75 @@
GIF
GPU
GUI
@@ -15,6 +20,10 @@
FUD
FWIW
FYI
+Gb
+Gi
+GiB
+Gib
GnuPG
HDD
HDMI
GB
GHz
GIF
@@ -26,6 +35,7 @@
HTML
HTTP
IDE
+IEC
IEEE
IMHO
IMNSHO
IMO
@@ -34,7 +44,13 @@
IRC
ISO
ISS
+JEDEC
JPEG
+KB
+Kb
+Ki
+KiB
+Kib
JPEG
MB
MP3
MP3s
MPEG
Mb
+Mi
+MiB
+Mib
OTOH
@@ -44,12 +60,22 @@
PCMCIA
PDF
PGP
@ -55,10 +54,12 @@
+Pi
+PiB
+Pib
QA
RFC
ROFL
RTFM
SQL
Sep
+TB
+Tb
+Ti
@ -66,7 +67,8 @@
+Tib
URL
USB
USB
UTC
@@ -57,10 +83,22 @@
WWW
WYSIWYG
XML
@ -81,10 +83,11 @@
+Zi
+ZiB
+Zib
bpm
cf
dpi
+kB
+kb
pp
resp
vs
vols
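
Note on the patch above: the visible additions follow a regular pattern, a decimal (SI) or binary (IEC) prefix followed by `B` (byte) or `b` (bit), plus the bare IEC prefix. A rough Python sketch that enumerates that pattern is below. It is an illustration only, not how the patch was produced; the prefix lists are assumptions based on the SI and IEC naming schemes, and the set it generates is larger than what the patch adds.

```python
# Decimal prefixes as they appear in byte/bit abbreviations (both "k" and "K" occur in the patch).
SI_PREFIXES = ["k", "K", "M", "G", "T", "P", "E", "Z", "Y"]
# Binary prefixes defined by IEC.
IEC_PREFIXES = ["Ki", "Mi", "Gi", "Ti", "Pi", "Ei", "Zi", "Yi"]


def unit_tokens():
    """Return the sorted set of prefix/unit abbreviations following the pattern above."""
    tokens = set()
    for prefix in SI_PREFIXES:
        tokens.update({prefix + "B", prefix + "b"})
    for prefix in IEC_PREFIXES:
        # The patch also lists the bare binary prefix, e.g. "Ki" next to "KiB" and "Kib".
        tokens.update({prefix, prefix + "B", prefix + "b"})
    return sorted(tokens)


tokens = unit_tokens()
print(len(tokens), tokens[:6])
```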


@ -1,13 +1,3 @@
--- en_GB.aff 2010-04-15 14:51:13.000000000 +0100
+++ en_GB.aff 2010-04-15 14:52:07.000000000 +0100
@@ -6,6 +6,7 @@
# R 1.18, 11/04/05
SET ISO8859-1
TRY esiaénrtolcdugmfphbyvkw-'.zqjxSNRTLCGDMFPHBEAUYOIVKWóöâôZQJXÅçèîêàïüäñ
+WORDCHARS 0123456789'
REP 27
REP f ph
REP ph f
--- scowl/speller/en.aff 2010-04-15 14:56:37.000000000 +0100
+++ scowl/speller/en.aff 2010-04-15 14:57:08.000000000 +0100
@@ -12,7 +12,7 @@


@ -1,37 +0,0 @@
--- wordlist/en_GB.dic 2011-02-08 11:43:16.730271377 +0000
+++ wordlist/en_GB.dic 2011-02-08 11:43:35.261482189 +0000
@@ -1,4 +1,4 @@
-46286
+46285
abaft
abbreviation/M
abdicate/DNGSn
@@ -643,7 +643,6 @@
Calder
caldera/SM
caldron's
-calender/dMS
calibrate/SAGDN
calibrater's
calico/M
--- wordlist/alt12dicts/2of12full.txt 2011-02-08 11:55:46.478837084 +0000
+++ wordlist/alt12dicts/2of12full.txt 2011-02-08 11:59:12.455202478 +0000
@@ -6468,7 +6468,6 @@
12: 12 -# -& calendar
2: 1 1# -& calendar month
2: 1 1# -& calendar year
- 4: 4 -# -& calender
12: 12 -# -& calf
9: 8 1# -& calfskin
3: 3 -# -& Calgary
diff -ru wordlist/alt12dicts/5desk.txt wordlist/alt12dicts/5desk.txt
--- wordlist/alt12dicts/5desk.txt 2011-02-08 11:55:46.548837888 +0000
+++ wordlist/alt12dicts/5desk.txt 2011-02-08 11:58:55.162003701 +0000
@@ -7353,7 +7353,6 @@
Caledonia
Caledonian
calendar
-calender
calendric
calendrical
calends


@ -1,9 +0,0 @@
--- wordlist/scowl/README.in 2014-10-08 14:00:05.682971887 +0100
+++ wordlist/scowl/README.in 2014-10-08 14:00:12.369050485 +0100
@@ -1,6 +1,4 @@
Spell Checking Oriented Word Lists (SCOWL)
-@`if [ "$SCOWL_VERSION" ]; then echo -n "Version $SCOWL_VERSION"; fi`
-@`git log --pretty=format:'%cd [%h]' -n 1 --`
by Kevin Atkinson (kevina@gnu.org)
The SCOWL is a collection of word lists split up in various sizes, and


@ -1,10 +0,0 @@
diff -ru wordlist/scowl/src/deaccent.hh wordlist/scowl/src/deaccent.hh
--- wordlist/scowl/src/deaccent.hh
+++ wordlist/scowl/src/deaccent.hh
@@ -1,5 +1,5 @@
-static const char deaccent_lookup[256] = {
+static const unsigned char deaccent_lookup[256] = {
0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20,
21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38,
39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56,
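
Note on the patch above: the table has 256 entries, so it presumably holds values above 127, which do not fit in a `char` on platforms where plain `char` is signed (signedness of `char` is implementation-defined in C and C++). The effect can be sketched in Python with `ctypes`, where `c_int8` and `c_uint8` stand in for signed and unsigned 8-bit storage. The helper names are hypothetical.

```python
import ctypes


def as_signed_char(value):
    """Store value in 8 signed bits and read it back (ctypes does no overflow checking)."""
    return ctypes.c_int8(value).value


def as_unsigned_char(value):
    """Store value in 8 unsigned bits and read it back."""
    return ctypes.c_uint8(value).value


for entry in (65, 127, 128, 200, 255):
    print(entry, as_signed_char(entry), as_unsigned_char(entry))
```

Entries up to 127 survive either way; larger ones only round-trip through the unsigned type, which is consistent with the switch to `unsigned char` in the hunk.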


@ -1,18 +0,0 @@
--- wordlist.orig/scowl/src/add-affixes 2012-10-11 13:05:58.864864580 +0100
+++ wordlist/scowl/src/add-affixes 2012-10-11 14:11:05.144908897 +0100
@@ -74,6 +74,15 @@
@a = grep {not $remove{"$w:$p:$_"}} @a;
next unless @a;
$lookup{$w} .= join("\n",@a)."\n";
+ next unless $p eq 'N';
+
+ # For irregular nouns that have plurals that do not end in s
+ # then add the possessive form of the plural as well
+ foreach (@a) {
+ next unless (substr($_,-1,1) ne 's');
+ $possessive{$_} = "$_\'s\n";
+ $lookup{$w} .= $possessive{$_};
+ }
}
unless ($no_possessive) {
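
Note on the patch above: for noun entries it adds a possessive form for every plural that does not end in "s" (irregular plurals such as "children"). The same rule can be sketched in Python; `plural_possessives` is a hypothetical name, and the sketch ignores the surrounding `%lookup` and `%remove` bookkeeping of the Perl script.

```python
def plural_possessives(plurals):
    """Return possessive forms for plurals that do not end in 's'.

    Regular plurals ("cats") are skipped here, mirroring the
    `next unless (substr($_,-1,1) ne 's')` test in the Perl hunk.
    """
    return [f"{form}'s" for form in plurals if not form.endswith("s")]


print(plural_possessives(["children", "cats", "men", "oxen"]))
```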


@ -1,34 +1,20 @@
 Name: hunspell-en
 Summary: English hunspell dictionaries
-%global upstreamid 20140811.1
+%global upstreamid 20170824
 Version: 0.%{upstreamid}
-Release: 8%{?dist}
-Source0: https://github.com/kevina/wordlist/archive/rel-2014.08.11.1.tar.gz
-Source1: http://en-gb.pyxidium.co.uk/dictionary/en_GB.zip
-#See http://mxr.mozilla.org/mozilla/source/extensions/spellcheck/locales/en-US/hunspell/mozilla_words.diff?raw=1
-Patch0: mozilla_words.patch
-Patch1: en_GB-singleletters.patch
-Patch2: en_GB.two_initial_caps.patch
-#See http://sourceforge.net/tracker/?func=detail&aid=2355344&group_id=10079&atid=1014602
-#filter removes words with "." in them
-Patch3: en_US-strippedabbrevs.patch
-#See https://sourceforge.net/tracker/?func=detail&aid=2987192&group_id=143754&atid=756397
-#to allow "didn't" instead of suggesting change to typographical apostrophe
-Patch4: hunspell-en-allow-non-typographical.marks.patch
-#See https://sourceforge.net/tracker/?func=detail&aid=3012183&group_id=10079&atid=1014602
-#See https://bugzilla.redhat.com/show_bug.cgi?id=619577 add SI and IEC prefixes
-Patch5: hunspell-en-SI_and_IEC.patch
-#See https://sourceforge.net/tracker/?func=detail&aid=3175662&group_id=10079&atid=1014602 obscure Calender hides misspelling of Calendar
-Patch6: hunspell-en-calender.patch
-#valid English words that are archaic or rare in en-GB but not in en-IE
-Patch7: en_IE.supplemental.patch
-#call git to get the release hash, but this is a tarball
-Patch8: hunspell-en-dont-call-git-during-build.patch
-#fix build
-Patch9: hunspell-en-fixbuild.patch
-#rhbz#1492306 for better or worse treat etc the same in US and GB
-Patch10: en_GB.etc.patch
-URL: http://wordlist.sourceforge.net/
+Release: 1%{?dist}
+Source0: http://downloads.sourceforge.net/wordlist/scowl-2017.08.24.tar.gz
+#prefer both ize and ise for en_GB
+Patch0: hunspell-en_GB-both.patch
+##See https://sourceforge.net/tracker/?func=detail&aid=2987192&group_id=143754&atid=756397
+##to allow "didn't" instead of suggesting change to typographical apostrophe
+Patch1: hunspell-en-allow-non-typographical.marks.patch
+##See http://mxr.mozilla.org/mozilla/source/extensions/spellcheck/locales/en-US/hunspell/mozilla_words.diff?raw=1
+Patch2: mozilla_words.patch
+##See https://sourceforge.net/tracker/?func=detail&aid=3012183&group_id=10079&atid=1014602
+##See https://bugzilla.redhat.com/show_bug.cgi?id=619577 add SI and IEC prefixes
+Patch3: hunspell-en-SI_and_IEC.patch
+URL: http://wordlist.aspell.net/
 License: LGPLv2+ and LGPLv2 and BSD
 BuildArch: noarch
 BuildRequires: aspell, zip, dos2unix, perl-Getopt-Long
@ -55,44 +41,20 @@ Summary: UK English hunspell dictionaries
UK English hunspell dictionaries
%prep
%setup -q -n wordlist-rel-2014.08.11.1
%setup -q -T -D -a 1 -n wordlist-rel-2014.08.11.1
%patch0 -p0 -b .mozilla
%patch1 -p1 -b .singleletters
%patch2 -p1 -b .two_initial_cap
%patch3 -p0 -b .strippedabbrevs
%patch4 -p0 -b .allow-non-typographical
%patch5 -p0 -b .SI_and_IEC
%patch6 -p1 -b .calender
%patch7 -p1 -b .en_IE
%patch8 -p1 -b .nogit
%patch9 -p1 -b .fixbuild
%patch10 -p1 -b .etc
%autosetup -p1 -n scowl-2017.08.24
%build
export PERL5LIB=`pwd`/scowl/r/varcon${PERL5LIB:+:${PERL5LIB}}
make
cd scowl/speller
cd speller
make hunspell
for i in README_en_CA.txt README_en_US.txt; do
if ! iconv -f utf-8 -t utf-8 -o /dev/null $i > /dev/null 2>&1; then
iconv -f ISO-8859-1 -t UTF-8 $i > $i.new
touch -r $i $i.new
mv -f $i.new $i
fi
tr -d '\r' < $i > $i.new
touch -r $i $i.new
mv -f $i.new $i
done
%install
mkdir -p $RPM_BUILD_ROOT/%{_datadir}/myspell
cp -p en_??.dic en_??.aff $RPM_BUILD_ROOT/%{_datadir}/myspell
cd scowl/speller
cd speller
cp -p en_??.dic en_??.aff $RPM_BUILD_ROOT/%{_datadir}/myspell
pushd $RPM_BUILD_ROOT/%{_datadir}/myspell/
en_GB_aliases="en_AG en_AU en_BS en_BW en_BZ en_DK en_GH en_HK en_IE en_IN en_JM en_MW en_NA en_NG en_NZ en_SG en_TT en_ZA en_ZM en_ZW"
en_GB_aliases="en_AG en_BS en_BW en_BZ en_DK en_GH en_HK en_IE en_IN en_JM en_MW en_NA en_NG en_NZ en_SG en_TT en_ZA en_ZM en_ZW"
for lang in $en_GB_aliases; do
ln -s en_GB.aff $lang.aff
ln -s en_GB.dic $lang.dic
@ -106,20 +68,27 @@ popd
%files
%doc scowl/speller/README_en_CA.txt
%doc speller/README_en_AU.txt
%doc speller/README_en_CA.txt
%{_datadir}/myspell/*
%exclude %{_datadir}/myspell/en_GB.*
%exclude %{_datadir}/myspell/en_US.*
%files US
%doc scowl/speller/README_en_US.txt
%doc speller/README_en_US.txt
%{_datadir}/myspell/en_US.*
%files GB
%doc README_en_GB.txt
%doc speller/README_en_GB.txt
%{_datadir}/myspell/en_GB.*
%changelog
* Thu Sep 28 2017 Caolán McNamara <caolanm@redhat.com> - 0.20170824-1
- update to latest version
* Mon Sep 25 2017 Caolán McNamara <caolanm@redhat.com> - 0.20140811.1-9
- Resolves: rhbz#1494968 perl regex rule changes result in broken en_US dict
* Thu Sep 21 2017 Caolán McNamara <caolanm@redhat.com> - 0.20140811.1-8
- Resolves: rhbz#1492306 for better or worse treat etc the same in US and GB
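
Note on the %install section of the spec above: the locale aliases are plain symlinks to the en_GB files, and the updated alias list no longer contains en_AU, presumably because this SCOWL release builds a separate en_AU dictionary (a README_en_AU.txt is now packaged). A minimal Python sketch of the same aliasing loop follows; `link_aliases` is a hypothetical helper, and the directory is a temporary stand-in for `%{_datadir}/myspell`.

```python
import os
import tempfile

# Alias list as it appears in the updated spec (en_AU no longer included).
EN_GB_ALIASES = (
    "en_AG en_BS en_BW en_BZ en_DK en_GH en_HK en_IE en_IN en_JM "
    "en_MW en_NA en_NG en_NZ en_SG en_TT en_ZA en_ZM en_ZW"
).split()


def link_aliases(directory, aliases, target="en_GB"):
    """Create <alias>.aff and <alias>.dic as relative symlinks to the target dictionary."""
    for lang in aliases:
        for ext in ("aff", "dic"):
            os.symlink(f"{target}.{ext}", os.path.join(directory, f"{lang}.{ext}"))


with tempfile.TemporaryDirectory() as myspell_dir:
    link_aliases(myspell_dir, EN_GB_ALIASES)
    print(sorted(os.listdir(myspell_dir))[:4])
```

Relative link targets keep the aliases valid wherever the directory ends up, which matches the `ln -s en_GB.aff $lang.aff` form used in the spec.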

hunspell-en_GB-both.patch (new file)

@ -0,0 +1,12 @@
--- scowl-2017.08.24/speller/make-hunspell-dict.orig 2017-09-28 12:51:24.814098514 +0100
+++ scowl-2017.08.24/speller/make-hunspell-dict 2017-09-28 12:51:50.645899464 +0100
@@ -85,8 +85,7 @@
doit en_US "mk-list --accents=strip en_US $SIZE"
doit en_CA "mk-list --accents=strip en_CA $SIZE"
- doit en_GB-ize "mk-list --accents=strip en_GB-ize $SIZE"
- doit en_GB-ise "mk-list --accents=strip en_GB-ise $SIZE"
+ doit en_GB "mk-list --accents=strip en_GB-ize en_GB-ise $SIZE"
doit en_AU "mk-list --accents=strip en_AU $SIZE"
doit en_US-large "mk-list -v1 --accents=both en_US 70"
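
Note on the patch above: instead of building separate en_GB-ize and en_GB-ise dictionaries, it passes both spelling categories to a single `mk-list` call so that en_GB accepts either form. How `mk-list` combines them internally is not shown here; the sketch below simply models the intended effect as a set union, with made-up word lists.

```python
def merge_wordlists(*wordlists):
    """Return the sorted union of several word lists, dropping duplicates."""
    merged = set()
    for words in wordlists:
        merged.update(words)
    return sorted(merged)


# Hypothetical miniature lists standing in for the -ize and -ise variants.
ize_words = ["colour", "organize", "realize"]
ise_words = ["colour", "organise", "realise"]
en_gb = merge_wordlists(ize_words, ise_words)
print(en_gb)
```

Words shared by both variants ("colour") appear once; words that differ only in the suffix appear in both spellings.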


@ -1,66 +1,42 @@
--- scowl/r/special/proper-names 2008-02-08 11:53:27.000000000 +0000
+++ scowl/r/special/proper-names 2008-02-08 12:06:00.000000000 +0000
@@ -8,6 +8,8 @@
@@ -9,6 +9,8 @@
BitTorrent
Bluetooth
Bugzilla
+Camino
+ChatZilla
CVS
Closure
Debian
Dropbox
@@ -19,23 +21,28 @@
GNU
Cressida
@@ -24,6 +26,7 @@
Gaia
Gentoo
GitHub
+Haskell
Hunspell
IKEA
ISO
Instagram
@@ -31,6 +34,7 @@
Ispell
LibreOffice
LyX
+Mandriva
Mozilla
+MySpell
NSA
Netflix
OpenOffice
@@ -38,12 +42,14 @@
PayPal
PowerPC
Roku
+SeaMonkey
SUSE
SVN
Scala
Seeger
Slackware
Sourceforge
+Sunbird
Thunderbird
Troilus
Twitter
Ubuntu
--- scowl/speller/en.aff 2008-02-08 20:28:24.000000000 +0000
+++ scowl/speller/en.aff 2008-02-08 20:28:45.000000000 +0000
@@ -110,13 +110,17 @@
SFX L Y 1
SFX L 0 ment .
-REP 88
+SFX i N 1
+SFX i us i us
+
+REP 90
REP a ei
REP ei a
REP a ey
REP ey a
REP ai ie
REP ie ai
+REP alot a_lot
REP are air
REP are ear
REP are eir
@@ -199,3 +203,4 @@
REP shun tion
REP shun sion
REP shun cion
+REP sitted sat
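
Note on the en.aff hunk above: it adds two REP (replacement) rules and raises the declared count from 88 to 90 to match. REP rules give the suggester common misspelling patterns; an underscore in the replacement stands for a space, so `REP alot a_lot` suggests "a lot". A simplified Python sketch of that mechanism follows. It is an assumption-laden model, not Hunspell's actual suggestion algorithm, and `parse_rep` and `rep_suggestions` are hypothetical names.

```python
def parse_rep(aff_lines):
    """Return (declared_count, rules) from the REP lines of an .aff file."""
    declared = None
    rules = []
    for line in aff_lines:
        parts = line.split()
        if len(parts) == 2 and parts[0] == "REP" and parts[1].isdigit():
            declared = int(parts[1])  # header line, e.g. "REP 90"
        elif len(parts) == 3 and parts[0] == "REP":
            # Underscore in the replacement denotes a space.
            rules.append((parts[1], parts[2].replace("_", " ")))
    return declared, rules


def rep_suggestions(word, rules, dictionary):
    """Apply each rule at every position; keep candidates whose words are all known."""
    found = []
    for pattern, replacement in rules:
        start = word.find(pattern)
        while start != -1:
            candidate = word[:start] + replacement + word[start + len(pattern):]
            known = all(part in dictionary for part in candidate.split())
            if known and candidate not in found:
                found.append(candidate)
            start = word.find(pattern, start + 1)
    return found


aff = ["REP 2", "REP alot a_lot", "REP sitted sat"]
declared, rules = parse_rep(aff)
words = {"a", "lot", "sat"}
print(declared == len(rules), rep_suggestions("alot", rules, words))
```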


@ -1,2 +1 @@
-218909136738f4564b81ecd145ade6ee en_GB.zip
-b39e3879a5f2e20eaebffb8fa6d24f5e rel-2014.08.11.1.tar.gz
+SHA512 (scowl-2017.08.24.tar.gz) = 0c8a9d8ca55cd757d7074c7ed8a0f82b6e7fe19aa1e6a217efa895ce4306a7638e1324df7de0ee3e1004026ae8164f6148c48fe59dc8722750a627e7fdcb049b
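
Note on the sources change above: the two old entries use a bare "digest filename" form, while the new entry uses the "ALGO (filename) = digest" form with SHA512. A minimal Python sketch for parsing the new form and checking a downloaded tarball against it is below. The function names are hypothetical, and the packaging tooling normally performs this check itself.

```python
import hashlib
import re

# Matches lines such as: SHA512 (scowl-2017.08.24.tar.gz) = 0c8a9d...
SOURCES_LINE = re.compile(r"^(?P<algo>[A-Z0-9]+) \((?P<name>.+)\) = (?P<digest>[0-9a-f]+)$")


def parse_sources_line(line):
    """Split an 'ALGO (filename) = digest' line into its three fields."""
    match = SOURCES_LINE.match(line.strip())
    if match is None:
        raise ValueError(f"unrecognised sources line: {line!r}")
    return match.group("algo"), match.group("name"), match.group("digest")


def verify(data, algo, digest):
    """Return True if the digest of data under the named algorithm matches."""
    return hashlib.new(algo.lower(), data).hexdigest() == digest


# Self-contained demonstration with in-memory data instead of a real tarball.
payload = b"demo tarball contents"
line = f"SHA512 (demo.tar.gz) = {hashlib.sha512(payload).hexdigest()}"
algo, name, digest = parse_sources_line(line)
print(name, verify(payload, algo, digest))
```

To check a real download, read the file in binary mode and pass its bytes as `data`.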