.. Copyright (C) 2001-2023 NLTK Project
.. For license information, see LICENSE.TXT

===============
Grammar Parsing
===============

Grammars can be parsed from strings:

>>> from nltk import CFG
>>> grammar = CFG.fromstring("""
... S -> NP VP
... PP -> P NP
... NP -> Det N | NP PP
... VP -> V NP | VP PP
... Det -> 'a' | 'the'
... N -> 'dog' | 'cat'
... V -> 'chased' | 'sat'
... P -> 'on' | 'in'
... """)
>>> grammar
<Grammar with 14 productions>
>>> grammar.start()
S
>>> grammar.productions() # doctest: +NORMALIZE_WHITESPACE
[S -> NP VP, PP -> P NP, NP -> Det N, NP -> NP PP, VP -> V NP, VP -> VP PP,
Det -> 'a', Det -> 'the', N -> 'dog', N -> 'cat', V -> 'chased', V -> 'sat',
P -> 'on', P -> 'in']
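
The count of 14 follows from each ``|`` alternative becoming a production of its own. The sketch below illustrates that expansion in plain Python. ``expand_rules`` is a hypothetical helper written for this illustration, not NLTK's parser, and it assumes that no quoted terminal contains ``|`` or ``->``.

```python
def expand_rules(text):
    """Split each 'LHS -> alt1 | alt2' line into one (lhs, rhs) pair per alternative."""
    productions = []
    for line in text.strip().splitlines():
        lhs, _, rhs = line.partition("->")
        # Naive split: assumes '|' never appears inside a quoted terminal.
        for alternative in rhs.split("|"):
            # Each alternative is a whitespace-separated sequence of symbols.
            productions.append((lhs.strip(), tuple(alternative.split())))
    return productions


GRAMMAR_TEXT = """
S -> NP VP
PP -> P NP
NP -> Det N | NP PP
VP -> V NP | VP PP
Det -> 'a' | 'the'
N -> 'dog' | 'cat'
V -> 'chased' | 'sat'
P -> 'on' | 'in'
"""

productions = expand_rules(GRAMMAR_TEXT)
print(len(productions))  # 14
print(productions[0])    # ('S', ('NP', 'VP'))
```

Eight rule lines, six of which carry two alternatives each, give 2 + 6 * 2 = 14 productions.
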

Probabilistic CFGs attach a probability, in square brackets, to each production:

>>> from nltk import PCFG
>>> toy_pcfg1 = PCFG.fromstring("""
... S -> NP VP [1.0]
... NP -> Det N [0.5] | NP PP [0.25] | 'John' [0.1] | 'I' [0.15]
... Det -> 'the' [0.8] | 'my' [0.2]
... N -> 'man' [0.5] | 'telescope' [0.5]
... VP -> VP PP [0.1] | V NP [0.7] | V [0.2]
... V -> 'ate' [0.35] | 'saw' [0.65]
... PP -> P NP [1.0]
... P -> 'with' [0.61] | 'under' [0.39]
... """)
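
In a PCFG, the probabilities of all productions that share a left-hand side are expected to sum to 1.0. The following stdlib-only sketch checks that property for ``toy_pcfg1``. The ``RULES`` table is transcribed by hand from the grammar above, and the check is an illustration under that assumption, not NLTK's own validation code.

```python
import math
from collections import defaultdict

# (lhs, rhs, probability) triples transcribed by hand from toy_pcfg1.
RULES = [
    ("S", ("NP", "VP"), 1.0),
    ("NP", ("Det", "N"), 0.5),
    ("NP", ("NP", "PP"), 0.25),
    ("NP", ("'John'",), 0.1),
    ("NP", ("'I'",), 0.15),
    ("Det", ("'the'",), 0.8),
    ("Det", ("'my'",), 0.2),
    ("N", ("'man'",), 0.5),
    ("N", ("'telescope'",), 0.5),
    ("VP", ("VP", "PP"), 0.1),
    ("VP", ("V", "NP"), 0.7),
    ("VP", ("V",), 0.2),
    ("V", ("'ate'",), 0.35),
    ("V", ("'saw'",), 0.65),
    ("PP", ("P", "NP"), 1.0),
    ("P", ("'with'",), 0.61),
    ("P", ("'under'",), 0.39),
]

# Accumulate the probability mass assigned to each left-hand side.
totals = defaultdict(float)
for lhs, _rhs, prob in RULES:
    totals[lhs] += prob

# Compare with a tolerance, because sums of floats are rarely exact.
is_proper = all(math.isclose(total, 1.0) for total in totals.values())
print(is_proper)
```

The comparison uses ``math.isclose`` rather than ``==`` so that rounding error in sums such as 0.1 + 0.7 + 0.2 does not cause a false failure.
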

Nonterminals with Chomsky Normal Form style annotations, such as ``VP^<TOP>``, are parsed as single symbols (test for bug 474):

>>> g = CFG.fromstring("VP^<TOP> -> VBP NP^<VP-TOP>")
>>> g.productions()[0].lhs()
VP^<TOP>

Grammars can contain both empty strings and empty productions:

>>> from nltk.grammar import CFG
>>> from nltk.parse.generate import generate
>>> grammar = CFG.fromstring("""
... S -> A B
... A -> 'a'
... # An empty string:
... B -> 'b' | ''
... """)
>>> list(generate(grammar))
[['a', 'b'], ['a', '']]
>>> grammar = CFG.fromstring("""
... S -> A B
... A -> 'a'
... # An empty production:
... B -> 'b' |
... """)
>>> list(generate(grammar))
[['a', 'b'], ['a']]
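
The difference between the two outputs can be reproduced with a small generator in plain Python. This is a sketch only: ``generate_sentences`` and the dictionary encoding of the grammars are hypothetical, it is not how ``nltk.parse.generate`` is implemented, and it assumes a non-recursive grammar in which any symbol that is not a dictionary key is a terminal.

```python
from itertools import product


def generate_sentences(grammar, symbol):
    """Yield every token list derivable from `symbol` in a non-recursive grammar."""
    if symbol not in grammar:
        # Terminal: contributes exactly one token, even if that token is ''.
        yield [symbol]
        return
    for rhs in grammar[symbol]:
        # An empty rhs makes product() yield a single empty tuple,
        # i.e. one expansion that contributes no tokens at all.
        expansions = (list(generate_sentences(grammar, s)) for s in rhs)
        for parts in product(*expansions):
            yield [token for part in parts for token in part]


# B -> 'b' | ''   (the second alternative is a terminal that is the empty string)
with_empty_string = {"S": [["A", "B"]], "A": [["a"]], "B": [["b"], [""]]}
# B -> 'b' |      (the second alternative has no symbols on its right-hand side)
with_empty_production = {"S": [["A", "B"]], "A": [["a"]], "B": [["b"], []]}

print(list(generate_sentences(with_empty_string, "S")))      # [['a', 'b'], ['a', '']]
print(list(generate_sentences(with_empty_production, "S")))  # [['a', 'b'], ['a']]
```

An empty string is still a token, so it appears in the generated sentence. An empty production expands to nothing, so the sentence is one token shorter.
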