Usage¶
parso works around grammars. You can create Python grammars by calling parso.load_grammar(). Grammars (with a custom tokenizer and custom parser trees) can also be created by directly instantiating parso.Grammar(). More information about the resulting objects can be found in the parser tree documentation.
The simplest way of using parso is without even loading a grammar (parso.parse()):
>>> import parso
>>> parso.parse('foo + bar')
<Module: @1-1>
Loading a Grammar¶
Typically if you want to work with one specific Python version, use:
parso.load_grammar(*, version: str = None, path: str = None)
Loads a parso.Grammar. The default version is the current Python version.
Parameters: version (str) and path (str), both keyword-only.
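As a minimal sketch of loading a grammar for one specific version, the snippet below pins the grammar to Python 3.8 and parses a single statement with it. The version string '3.8' and the sample code are arbitrary choices for illustration, and the sketch assumes the installed parso release ships a grammar for that version.

```python
import parso

# Load the grammar for one specific Python version.
# '3.8' is an illustrative choice, not a recommendation.
grammar = parso.load_grammar(version='3.8')

# The returned object is a parso.Grammar (or a subclass of it).
module = grammar.parse('x = 1\n')

# parso trees round-trip: get_code() reproduces the original source.
print(isinstance(grammar, parso.Grammar))
print(repr(module.get_code()))
```

Pinning the version keeps the result independent of the interpreter that happens to run parso.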
Grammar methods¶
You will get back a grammar object that you can use to parse code and find issues in it:
class parso.Grammar(text: str, *, tokenizer, parser=<class 'parso.parser.BaseParser'>, diff_parser=None)
parso.load_grammar() returns instances of this class. Creating custom non-Python grammars by calling this is not supported yet.
Parameters: text – A BNF representation of your grammar.
parse(code: Union[str, bytes] = None, *, error_recovery=True, path: Union[os.PathLike, str] = None, start_symbol: str = None, cache=False, diff_cache=False, cache_path: Union[os.PathLike, str] = None, file_io: parso.file_io.FileIO = None) → _NodeT
If you want to parse a Python file, this is most likely where you want to start.
If you need finer grained control over the parsed instance, there will be other ways to access it.
Parameters:
- code (str) – A unicode or bytes string. When it is not possible to decode bytes to a string, a UnicodeDecodeError is raised.
- error_recovery (bool) – If enabled, a tree is returned for any code; invalid parts are returned as error nodes. If disabled, you will get a parso.ParserSyntaxError when a syntax error is encountered in your code.
- start_symbol (str) – The grammar rule (nonterminal) that you want to parse. Only allowed to be used when error_recovery is False.
- path (str) – The path to the file you want to open. Only needed for caching.
- cache (bool) – Keeps a copy of the parser tree in RAM and on disk if a path is given. Returns the cached trees if the corresponding files on disk have not changed. Note that this stores pickle files on your file system (e.g. for Linux in ~/.cache/parso/).
- diff_cache (bool) – Diffs the cached Python module against the new code and tries to parse only the parts that have changed. Returns the same (changed) module that is found in the cache. When using this option, do not keep working with modules previously cached under that path, because their contents might change. This option is still somewhat experimental. If you want stability, please don’t use it.
- cache_path (str) – If given, saves the parso cache in this directory. If not given, the default cache location for each platform is used.
Returns: A subclass of parso.tree.NodeOrLeaf. Typically a parso.python.tree.Module.
iter_errors(node)
Given a parso.tree.NodeOrLeaf, returns a generator of parso.normalizer.Issue objects. For Python these are syntax and indentation errors.
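The effect of error_recovery on Grammar.parse() can be sketched as follows. The broken sample code and the pinned version '3.8' are illustrative assumptions; the exception is caught as parso.ParserSyntaxError, which is the name current parso releases export for this case, as far as I am aware.

```python
import parso

grammar = parso.load_grammar(version='3.8')  # version chosen for illustration
broken_code = 'foo +\nbar\ncontinue'

# Default behaviour: error recovery is on, so a tree is returned even for
# invalid code, and the tree still reproduces the source exactly.
module = grammar.parse(broken_code)
round_trip = module.get_code()

# Strict behaviour: with error recovery off, the first syntax error raises.
try:
    grammar.parse(broken_code, error_recovery=False)
    strict_error = None
except parso.ParserSyntaxError as exc:
    strict_error = exc

print(round_trip == broken_code)
print(type(strict_error).__name__)
```

Error recovery suits editor-style tooling that has to cope with half-written code; the strict mode fits checks that should simply reject invalid input.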
Error Retrieval¶
parso is able to find multiple errors in your source code. Iterating through those errors yields the following instances:
class parso.normalizer.Issue(node, code, message)
- code – An integer code that stands for the type of error.
- message – A message (string) for the issue.
- start_pos – The start position of the error as a tuple (line, column). As always in parso, the first line is 1 and the first column is 0.
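A short sketch of collecting issues and reading the three attributes described above. The sample source and the pinned version '3.8' are assumptions made for illustration; the exact messages depend on the parso release.

```python
import parso

grammar = parso.load_grammar(version='3.8')  # version chosen for illustration
module = grammar.parse('foo +\nbar\ncontinue')

# iter_errors() is a generator; materialise it to inspect every issue.
issues = list(grammar.iter_errors(module))

for issue in issues:
    line, column = issue.start_pos  # line is 1-based, column is 0-based
    print(f'{line}:{column} [{issue.code}] {issue.message}')
```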
Utility¶
parso also offers some utility functions that can be really useful:
parso.parse(code=None, **kwargs)
A utility function to avoid loading grammars. Params are documented in parso.Grammar.parse().
Parameters: version (str) – The version used by parso.load_grammar().
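For illustration, the shortcut can be combined with the version keyword like this. The version '3.8' and the sample expression are arbitrary; every other keyword argument is passed on to Grammar.parse().

```python
import parso

# Parse without creating a grammar object explicitly.
module = parso.parse('foo + bar', version='3.8')

print(module.type)        # node type of the root
print(module.get_code())  # the source, reproduced from the tree
```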
parso.split_lines(string: str, keepends: bool = False) → Sequence[str]
Intended for Python code. In contrast to Python’s str.splitlines(), treats form feeds and other special characters as normal text and splits only on \n and \r\n. Also different: returns [""] for an empty string input.
In Python 2.7, str.splitlines treats form feeds as normal characters. In Python 3, however, it also splits on form feeds.
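The differences from str.splitlines() can be sketched with a few small inputs. The sample strings are made up for illustration.

```python
import parso

# A form feed (\x0c) is kept as ordinary text by parso ...
with_form_feed = 'x = 1\x0cy = 2\nz = 3'
parso_lines = parso.split_lines(with_form_feed)
# ... while str.splitlines() in Python 3 breaks the line at it.
stdlib_lines = with_form_feed.splitlines()

# An empty string yields one empty line instead of an empty list.
empty_parso = parso.split_lines('')
empty_stdlib = ''.splitlines()

# keepends=True keeps the line terminators on each line.
kept = parso.split_lines('a\nb', keepends=True)

print(parso_lines, stdlib_lines)
print(empty_parso, empty_stdlib)
print(kept)
```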
parso.python_bytes_to_unicode(source: Union[str, bytes], encoding: str = 'utf-8', errors: str = 'strict') → str
Checks for unicode BOMs and PEP 263 encoding declarations. Then returns a unicode object like in bytes.decode().
Parameters:
- encoding – See bytes.decode() documentation.
- errors – See bytes.decode() documentation. errors can be 'strict', 'replace' or 'ignore'.
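A minimal sketch of decoding source bytes, assuming a file that declares latin-1 through a PEP 263 comment. The byte strings are made up for illustration.

```python
import parso

# The PEP 263 declaration on the first line overrides the default 'utf-8'.
latin1_source = b'# -*- coding: latin-1 -*-\nname = "caf\xe9"\n'
text = parso.python_bytes_to_unicode(latin1_source)

# With errors='replace', undecodable bytes become U+FFFD instead of raising.
replaced = parso.python_bytes_to_unicode(b'\xff', errors='replace')

# With the default errors='strict', the same bytes raise UnicodeDecodeError.
try:
    parso.python_bytes_to_unicode(b'\xff')
    strict_failed = False
except UnicodeDecodeError:
    strict_failed = True

print(text)
print(repr(replaced), strict_failed)
```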