Usage

parso is built around grammars. To get a Python grammar, call parso.load_grammar(). Grammars with a custom tokenizer and custom parser trees can also be created by instantiating parso.Grammar() directly. More information about the resulting objects can be found in the parser tree documentation.

The simplest way to use parso is to call parso.parse(), which does not require loading a grammar first:

>>> import parso
>>> parso.parse('foo + bar')
<Module: @1-1>
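
The returned module is an ordinary tree that can be walked through its children. The following is a minimal sketch of inspecting it; the pinned version='3.8' and the variable names are illustrative choices, not requirements.

```python
import parso

# Pin the grammar version so the result does not depend on the running interpreter.
module = parso.parse('foo + bar', version='3.8')

expr = module.children[0]   # the node for the whole expression
name = expr.children[0]     # the leaf for `foo`

print(expr.type)                                  # grammar rule that produced the node
print(name.value, name.start_pos, name.end_pos)   # positions are (line, column)
print(module.get_code())                          # the tree reproduces the source text
```

Lines are counted from 1 and columns from 0, so `foo` spans (1, 0) to (1, 3).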

Loading a Grammar

If you want to work with one specific Python version, use:

parso.load_grammar(*, version: str = None, path: str = None)

Loads a parso.Grammar. The default version is the current Python version.

Parameters:
  • version (str) – A Python version string, e.g. version='3.8'.
  • path (str) – A path to a grammar file.
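
As a short sketch of the call described above, the snippet below loads the grammar for one fixed version and parses a statement with it. The version string '3.8' is only an example value.

```python
import parso

# Load the grammar for a specific Python version instead of the running one.
grammar = parso.load_grammar(version='3.8')

# The grammar object is then used for parsing.
module = grammar.parse('x = 1\n')

print(isinstance(grammar, parso.Grammar))  # load_grammar() returns a Grammar instance
print(module.type)                         # type of the root node
```

Loading the grammar once and reusing it for several parse() calls avoids repeating the version argument.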

Grammar methods

You will get back a grammar object that you can use to parse code and find issues in it:

class parso.Grammar(text: str, *, tokenizer, parser=<class 'parso.parser.BaseParser'>, diff_parser=None)

parso.load_grammar() returns instances of this class.

Creating custom non-Python grammars by calling this is not supported yet.

Parameters:
  • text (str) – A BNF representation of your grammar.

parse(code: Union[str, bytes] = None, *, error_recovery=True, path: Union[os.PathLike, str] = None, start_symbol: str = None, cache=False, diff_cache=False, cache_path: Union[os.PathLike, str] = None, file_io: parso.file_io.FileIO = None) → _NodeT

If you want to parse a Python file, this is most likely where you want to start.

If you need finer-grained control over the parsed instance, there will be other ways to access it.

Parameters:
  • code (str or bytes) – A unicode or bytes string. When it’s not possible to decode bytes to a string, a UnicodeDecodeError is raised.
  • error_recovery (bool) – If enabled, a tree is returned for any code; invalid parts are represented as error nodes. If disabled, a parso.ParserSyntaxError is raised when a syntax error is encountered in your code.
  • start_symbol (str) – The grammar rule (nonterminal) that you want to parse. Only allowed to be used when error_recovery is False.
  • path (str) – The path to the file you want to open. Only needed for caching.
  • cache (bool) – Keeps a copy of the parser tree in RAM and on disk if a path is given. Returns the cached trees if the corresponding files on disk have not changed. Note that this stores pickle files on your file system (e.g. for Linux in ~/.cache/parso/).
  • diff_cache (bool) – Diffs the cached Python module against the new code and tries to parse only the parts that have changed. Returns the same (changed) module that is found in the cache. When using this option, you must no longer use modules previously cached under that path, because their contents might change. This option is still somewhat experimental. If you want stability, please don’t use it.
  • cache_path (str) – If given, saves the parso cache in this directory. If not given, the default cache location of each platform is used.
Returns:

A subclass of parso.tree.NodeOrLeaf. Typically a parso.python.tree.Module.
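
The effect of error_recovery can be sketched as follows. The broken input and the variable names are made up for illustration, and the exception class used here is parso.ParserSyntaxError as exported by the package.

```python
import parso

grammar = parso.load_grammar(version='3.8')
broken = 'foo = = bar\n'

# Default (error_recovery=True): a tree is returned even for invalid code,
# and it still reproduces the original source.
module = grammar.parse(broken)
round_trip_ok = module.get_code() == broken

# error_recovery=False: the first syntax error raises instead.
try:
    grammar.parse(broken, error_recovery=False)
    raised = False
except parso.ParserSyntaxError as exc:
    raised = True
    print('strict parsing failed:', exc)

print(round_trip_ok, raised)
```

Recovery mode suits editors and linters that must handle half-written code, while the strict mode fits callers that only accept valid input.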

iter_errors(node)

Given a parso.tree.NodeOrLeaf, returns a generator of parso.normalizer.Issue objects. For Python, these are the syntax and indentation errors found in the tree.
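
A small sketch of collecting those issues, assuming a grammar loaded for version '3.8' and a deliberately broken three-line input:

```python
import parso

grammar = parso.load_grammar(version='3.8')

# Two problems: an incomplete expression and a `continue` outside of a loop.
module = grammar.parse('foo +\nbar\ncontinue')

# list() collects every issue reported for the tree.
issues = list(grammar.iter_errors(module))
messages = [issue.message for issue in issues]

for message in messages:
    print(message)
```

Because the tree was built with error recovery, both problems are reported in a single pass rather than only the first one.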

refactor(base_node, node_to_str_map)

Error Retrieval

parso is able to find multiple errors in your source code. Iterating through those errors yields the following instances:

class parso.normalizer.Issue(node, code, message)

code = None

An integer code that stands for the type of error.

message = None

A message (string) for the issue.

start_pos = None

The start position of the error as a tuple (line, column). As always in parso, the first line is 1 and the first column is 0.
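
The three attributes can be combined into a compact report line. The helper describe() below is hypothetical and not part of parso; it only reads the documented code, message and start_pos attributes.

```python
import parso


def describe(issue):
    """Format one issue as 'line:column: [code] message' (hypothetical helper)."""
    line, column = issue.start_pos
    return f'{line}:{column}: [{issue.code}] {issue.message}'


grammar = parso.load_grammar(version='3.8')
module = grammar.parse('x = 1\ncontinue\n')

issues = list(grammar.iter_errors(module))
reports = [describe(issue) for issue in issues]

for report in reports:
    print(report)
```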

Utility

parso also offers some utility functions that can be really useful:

parso.parse(code=None, **kwargs)

A utility function to avoid loading grammars. Params are documented in parso.Grammar.parse().

Parameters:
  • version (str) – The version used by parso.load_grammar().

parso.split_lines(string: str, keepends: bool = False) → Sequence[str]

Intended for Python code. In contrast to Python’s str.splitlines(), treats form feeds and other special characters as normal text and splits only on \n and \r\n. Another difference: returns [""] for an empty string input.

In Python 2.7, str.splitlines treats form feeds as normal characters. In Python 3, however, str.splitlines also splits on form feeds.
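
The difference from str.splitlines() can be seen in a short comparison. The sample strings are invented for illustration.

```python
import parso

text = 'first\x0cstill first\nsecond'

parso_lines = parso.split_lines(text)   # the form feed stays inside the line
stdlib_lines = text.splitlines()        # Python 3 starts a new line at the form feed

empty_parso = parso.split_lines('')     # one empty line
empty_stdlib = ''.splitlines()          # no lines at all

with_ends = parso.split_lines('a\nb', keepends=True)  # line endings are kept

print(parso_lines)
print(stdlib_lines)
print(empty_parso, empty_stdlib)
print(with_ends)
```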

parso.python_bytes_to_unicode(source: Union[str, bytes], encoding: str = 'utf-8', errors: str = 'strict') → str

Checks for unicode BOMs and PEP 263 encoding declarations. Then returns a unicode object like in bytes.decode().

Parameters:
  • encoding – See bytes.decode() documentation.
  • errors – See bytes.decode() documentation. errors can be 'strict', 'replace' or 'ignore'.
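
A brief sketch of the decoding behavior, using invented byte strings: one carries a PEP 263 declaration, one contains a byte that is invalid in UTF-8, and one is already a str.

```python
import parso

# The PEP 263 declaration on the first line selects latin-1 for decoding.
latin1_source = b'# -*- coding: latin-1 -*-\nname = "caf\xe9"\n'
decoded = parso.python_bytes_to_unicode(latin1_source)

# Without a declaration the `encoding` argument (utf-8 by default) applies;
# errors='replace' substitutes U+FFFD for bytes that cannot be decoded.
replaced = parso.python_bytes_to_unicode(b'x = "\xff"\n', errors='replace')

# A str input is returned unchanged.
unchanged = parso.python_bytes_to_unicode('x = 1\n')

print(decoded)
print(replaced)
print(unchanged)
```

With the default errors='strict', the second call would raise a UnicodeDecodeError instead of substituting a replacement character.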

Used By

  • jedi (which is used by IPython and a lot of editor plugins).
  • mutmut (mutation tester)