Parsing docstrings
This module contains functions and classes that parse docstrings.
AUTHORS:
David Roe (2012-03-27) – initial version, based on Robert Bradshaw’s code.
Jeroen Demeyer (2014-08-28) – much improved handling of tolerances using interval arithmetic (trac ticket #16889).

class sage.doctest.parsing.MarkedOutput
Bases: str
A subclass of string with context for whether another string matches it.
EXAMPLES:
    sage: from sage.doctest.parsing import MarkedOutput
    sage: s = MarkedOutput("abc")
    sage: s.rel_tol
    0
    sage: s.update(rel_tol = .05)
    u'abc'
    sage: s.rel_tol
    0.0500000000000000
    sage: MarkedOutput(u"56 µs")
    u'56 µs'

update(**kwds)
EXAMPLES:
    sage: from sage.doctest.parsing import MarkedOutput
    sage: s = MarkedOutput("0.0007401")
    sage: s.update(abs_tol = .0000001)
    u'0.0007401'
    sage: s.rel_tol
    0
    sage: s.abs_tol
    1.00000000000000e-7
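The idea of a string that carries tolerance tags can be sketched in plain Python, without Sage. The class below is a hypothetical stand-in (the name TaggedOutput and its defaults are assumptions for illustration, not the Sage implementation): a str subclass that stores tags as attributes and returns itself from update() so that calls can be chained.

```python
class TaggedOutput(str):
    """Hypothetical stand-in for MarkedOutput, not the Sage class."""

    # Class-level defaults, so an untagged instance reports zero tolerance.
    random = False
    rel_tol = 0
    abs_tol = 0
    tol = 0

    def update(self, **kwds):
        # Record the tags on this instance and return it for chaining.
        self.__dict__.update(kwds)
        return self


s = TaggedOutput("0.0007401")
print(s.abs_tol)                                # class default
print(s.update(abs_tol=1e-7) == "0.0007401")    # still compares as a str
print(s.abs_tol)                                # instance value
```

Because the tags live on the instance, two equal strings can carry different tolerances without affecting ordinary string comparison.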

class sage.doctest.parsing.OriginalSource(example)
Bases: object
Context swapping out the pre-parsed source with the original for better reporting.
EXAMPLES:
    sage: from sage.doctest.sources import FileDocTestSource
    sage: from sage.doctest.control import DocTestDefaults
    sage: from sage.env import SAGE_SRC
    sage: import os
    sage: filename = os.path.join(SAGE_SRC,'sage','doctest','forker.py')
    sage: FDS = FileDocTestSource(filename,DocTestDefaults())
    sage: doctests, extras = FDS.create_doctests(globals())
    sage: ex = doctests[0].examples[0]
    sage: ex.sage_source
    u'doctest_var = 42; doctest_var^2\n'
    sage: ex.source
    u'doctest_var = Integer(42); doctest_var**Integer(2)\n'
    sage: from sage.doctest.parsing import OriginalSource
    sage: with OriginalSource(ex):
    ....:     ex.source
    u'doctest_var = 42; doctest_var^2\n'
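The swap-and-restore behaviour shown above can be sketched as an ordinary context manager. SwapSource is a hypothetical name, and the stand-in example object is built with types.SimpleNamespace; neither is part of Sage.

```python
from types import SimpleNamespace


class SwapSource:
    """Hypothetical sketch of a source-swapping context manager."""

    def __init__(self, example):
        self.example = example

    def __enter__(self):
        # Only swap when the example carries an un-preparsed source.
        if hasattr(self.example, "sage_source"):
            self.old_source = self.example.source
            self.example.source = self.example.sage_source

    def __exit__(self, *args):
        # Restore the pre-parsed source, even if the body raised.
        if hasattr(self.example, "sage_source"):
            self.example.source = self.old_source


ex = SimpleNamespace(
    sage_source="doctest_var = 42; doctest_var^2\n",
    source="doctest_var = Integer(42); doctest_var**Integer(2)\n",
)
with SwapSource(ex):
    inside = ex.source      # the original, human-written source
```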

sage.doctest.parsing.RIFtol(*args)
Create an element of the real interval field used for doctest tolerances.
It allows large numbers like 1e1000, parses strings with spaces like RIF(" - 1 ") out of the box, and carries a lot of precision. The latter is useful for testing libraries using arbitrary precision but not guaranteed rounding such as PARI. We use 1044 bits of precision, which should be good to deal with tolerances on numbers computed with 1024 bits of precision.
The interval approach also means that we do not need to worry about rounding errors, and it is very natural to see a number with tolerance as an interval.
EXAMPLES:
sage: from sage.doctest.parsing import RIFtol sage: RIFtol(-1, 1) 0.? sage: RIFtol(" - 1 ") -1 sage: RIFtol("1e1000") 1.00000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000?e1000

class sage.doctest.parsing.SageDocTestParser(optional_tags=(), long=False)
Bases: doctest.DocTestParser
A version of the standard doctest parser which handles Sage’s custom options and tolerances in floating point arithmetic.

parse(string, *args)
A Sage specialization of doctest.DocTestParser.
INPUT:
string – the string to parse.
name – optional string giving the name identifying string, to be used in error messages.
OUTPUT:
A list consisting of strings and doctest.Example instances. There will be at least one string between successive examples (exactly one unless long or optional tests are removed), and it will begin and end with a string.
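The alternating structure of the returned list is inherited from the standard library parser, so it can be illustrated without Sage. The sketch below uses plain doctest.DocTestParser with the usual ">>>" prompt; the Sage subclass additionally understands the "sage:" prompt, optional tags and tolerances, none of which is shown here.

```python
import doctest

text = "Explanatory text\n\n    >>> 1 + 1\n    2\n\nLater text"
parts = doctest.DocTestParser().parse(text)

# The list starts and ends with a string, with Example objects in between.
for part in parts:
    if isinstance(part, doctest.Example):
        print("example:", repr(part.source), "->", repr(part.want))
    else:
        print("text:   ", repr(part))
```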
EXAMPLES:
    sage: from sage.doctest.parsing import SageDocTestParser
    sage: DTP = SageDocTestParser(('sage','magma','guava'))
    sage: example = 'Explanatory text::\n\n sage: E = magma("EllipticCurve([1, 1, 1, -10, -10])") # optional: magma\n\nLater text'
    sage: parsed = DTP.parse(example)
    sage: parsed[0]
    'Explanatory text::\n\n'
    sage: parsed[1].sage_source
    'E = magma("EllipticCurve([1, 1, 1, -10, -10])") # optional: magma\n'
    sage: parsed[2]
    '\nLater text'
If the doctest parser is not created to accept a given optional argument, the corresponding examples will just be removed:
    sage: DTP2 = SageDocTestParser(('sage',))
    sage: parsed2 = DTP2.parse(example)
    sage: parsed2
    ['Explanatory text::\n\n', '\nLater text']
You can mark doctests as having a particular tolerance:
    sage: example2 = 'sage: gamma(1.6) # tol 2.0e-11\n0.893515349287690'
    sage: ex = DTP.parse(example2)[1]
    sage: ex.sage_source
    'gamma(1.6) # tol 2.0e-11\n'
    sage: ex.want
    u'0.893515349287690\n'
    sage: type(ex.want)
    <class 'sage.doctest.parsing.MarkedOutput'>
    sage: ex.want.tol
    2.000000000000000000...?e-11
You can use continuation lines:
    sage: s = "sage: for i in range(4):\n....: print(i)\n....:\n"
    sage: ex = DTP2.parse(s)[1]
    sage: ex.source
    'for i in range(Integer(4)):\n print(i)\n'
Sage currently accepts backslashes as indicating that the end of the current line should be joined to the next line. This feature allows for breaking large integers over multiple lines but is not standard for Python doctesting. It’s not guaranteed to persist, but works in Sage 5.5:
    sage: n = 1234\
    ....: 5678
    sage: print(n)
    12345678
    sage: type(n)
    <class 'sage.rings.integer.Integer'>
It also works without the ....: continuation prompt on the second line:
    sage: m = 8765\
    4321
    sage: print(m)
    87654321
Test that trac ticket #26575 is resolved:
    sage: example3 = 'sage: Zp(5,4,print_mode="digits")(5)\n...00010'
    sage: parsed3 = DTP.parse(example3)
    sage: dte = parsed3[1]
    sage: dte.sage_source
    'Zp(5,4,print_mode="digits")(5)\n'
    sage: dte.want
    '...00010\n'

class sage.doctest.parsing.SageOutputChecker
Bases: doctest.OutputChecker
A modification of the doctest OutputChecker that can check relative and absolute tolerance of answers.
EXAMPLES:
    sage: from sage.doctest.parsing import SageOutputChecker, MarkedOutput, SageDocTestParser
    sage: import doctest
    sage: optflag = doctest.NORMALIZE_WHITESPACE|doctest.ELLIPSIS
    sage: DTP = SageDocTestParser(('sage','magma','guava'))
    sage: OC = SageOutputChecker()
    sage: example2 = 'sage: gamma(1.6) # tol 2.0e-11\n0.893515349287690'
    sage: ex = DTP.parse(example2)[1]
    sage: ex.sage_source
    'gamma(1.6) # tol 2.0e-11\n'
    sage: ex.want
    u'0.893515349287690\n'
    sage: type(ex.want)
    <class 'sage.doctest.parsing.MarkedOutput'>
    sage: ex.want.tol
    2.000000000000000000...?e-11
    sage: OC.check_output(ex.want, '0.893515349287690', optflag)
    True
    sage: OC.check_output(ex.want, '0.8935153492877', optflag)
    True
    sage: OC.check_output(ex.want, '0', optflag)
    False
    sage: OC.check_output(ex.want, 'x + 0.8935153492877', optflag)
    False

add_tolerance(wantval, want)
Enlarge the real interval element wantval according to the tolerance options in want.
INPUT:
wantval – a real interval element
want – a MarkedOutput describing the tolerance
OUTPUT:
an interval element containing wantval
EXAMPLES:
    sage: from sage.doctest.parsing import MarkedOutput, SageOutputChecker
    sage: OC = SageOutputChecker()
    sage: want_tol = MarkedOutput().update(tol=0.0001)
    sage: want_abs = MarkedOutput().update(abs_tol=0.0001)
    sage: want_rel = MarkedOutput().update(rel_tol=0.0001)
    sage: OC.add_tolerance(RIF(pi.n(64)), want_tol).endpoints()
    (3.14127849432443, 3.14190681285516)
    sage: OC.add_tolerance(RIF(pi.n(64)), want_abs).endpoints()
    (3.14149265358979, 3.14169265358980)
    sage: OC.add_tolerance(RIF(pi.n(64)), want_rel).endpoints()
    (3.14127849432443, 3.14190681285516)
    sage: OC.add_tolerance(RIF(1e1000), want_tol)
    1.000?e1000
    sage: OC.add_tolerance(RIF(1e1000), want_abs)
    1.000000000000000?e1000
    sage: OC.add_tolerance(RIF(1e1000), want_rel)
    1.000?e1000
    sage: OC.add_tolerance(0, want_tol)
    0.000?
    sage: OC.add_tolerance(0, want_abs)
    0.000?
    sage: OC.add_tolerance(0, want_rel)
    0
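The outputs above suggest a simple rule: abs_tol widens the value by a fixed radius, rel_tol by a radius proportional to the value, and tol behaves like rel_tol except at zero, where it acts as an absolute tolerance. The sketch below encodes that rule as an assumption inferred from the examples; it uses exact fractions.Fraction endpoints in place of Sage's real intervals, and the name enlarge is hypothetical.

```python
from fractions import Fraction


def enlarge(value, tol=0, abs_tol=0, rel_tol=0):
    """Return (lower, upper) around value; a sketch, not Sage's code."""
    value = Fraction(value)
    if tol:
        # Assumed rule: 'tol' is relative, except at zero where it is absolute.
        radius = Fraction(tol) if value == 0 else abs(value) * Fraction(tol)
    elif abs_tol:
        radius = Fraction(abs_tol)
    elif rel_tol:
        radius = abs(value) * Fraction(rel_tol)
    else:
        radius = Fraction(0)
    return value - radius, value + radius


# Tolerances are passed as strings so that Fraction parses them exactly.
print(enlarge(10, tol="0.1"))
print(enlarge(0, tol="0.1"))
print(enlarge(0, rel_tol="0.1"))
```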

check_output(want, got, optionflags)
Checks to see if the output matches the desired output.
If want is a MarkedOutput instance, takes into account the desired tolerance.
INPUT:
want – a string or MarkedOutput
got – a string
optionflags – an integer, passed down to doctest.OutputChecker
OUTPUT:
boolean, whether got matches want up to the specified tolerance.
EXAMPLES:
    sage: from sage.doctest.parsing import MarkedOutput, SageOutputChecker
    sage: import doctest
    sage: optflag = doctest.NORMALIZE_WHITESPACE|doctest.ELLIPSIS
    sage: rndstr = MarkedOutput("I'm wrong!").update(random=True)
    sage: tentol = MarkedOutput("10.0").update(tol=.1)
    sage: tenabs = MarkedOutput("10.0").update(abs_tol=.1)
    sage: tenrel = MarkedOutput("10.0").update(rel_tol=.1)
    sage: zerotol = MarkedOutput("0.0").update(tol=.1)
    sage: zeroabs = MarkedOutput("0.0").update(abs_tol=.1)
    sage: zerorel = MarkedOutput("0.0").update(rel_tol=.1)
    sage: zero = "0.0"
    sage: nf = "9.5"
    sage: ten = "10.05"
    sage: eps = "-0.05"
    sage: OC = SageOutputChecker()

    sage: OC.check_output(rndstr,nf,optflag)
    True
    sage: OC.check_output(tentol,nf,optflag)
    True
    sage: OC.check_output(tentol,ten,optflag)
    True
    sage: OC.check_output(tentol,zero,optflag)
    False
    sage: OC.check_output(tenabs,nf,optflag)
    False
    sage: OC.check_output(tenabs,ten,optflag)
    True
    sage: OC.check_output(tenabs,zero,optflag)
    False
    sage: OC.check_output(tenrel,nf,optflag)
    True
    sage: OC.check_output(tenrel,ten,optflag)
    True
    sage: OC.check_output(tenrel,zero,optflag)
    False
    sage: OC.check_output(zerotol,zero,optflag)
    True
    sage: OC.check_output(zerotol,eps,optflag)
    True
    sage: OC.check_output(zerotol,ten,optflag)
    False
    sage: OC.check_output(zeroabs,zero,optflag)
    True
    sage: OC.check_output(zeroabs,eps,optflag)
    True
    sage: OC.check_output(zeroabs,ten,optflag)
    False
    sage: OC.check_output(zerorel,zero,optflag)
    True
    sage: OC.check_output(zerorel,eps,optflag)
    False
    sage: OC.check_output(zerorel,ten,optflag)
    False
More explicit tolerance checks:
    sage: _ = x # rel tol 1e10
    sage: raise RuntimeError # rel tol 1e10
    Traceback (most recent call last):
    ...
    RuntimeError
    sage: 1 # abs tol 2
    -0.5
    sage: print("0.9999") # rel tol 1e-4
    1.0
    sage: print("1.00001") # abs tol 1e-5
    1.0
    sage: 0 # rel tol 1
    1
Spaces before numbers or between the sign and number are ignored:
    sage: print("[ - 1, 2]") # abs tol 1e-10
    [-1,2]
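A tolerance-aware comparison of this kind can be sketched in plain Python: pull the numbers out of both strings, require the remaining text to agree once whitespace is ignored, and compare the numbers pairwise. Everything below (the names, the number pattern, the use of exact fractions instead of intervals) is an assumption made for illustration and not the SageOutputChecker implementation.

```python
import re
from fractions import Fraction

# A decimal number with an optional sign; whitespace after the sign is allowed.
NUMBER = re.compile(r"(?:[+-]\s*)?\d+(?:\.\d+)?(?:[eE][+-]?\d+)?")


def numbers(text):
    # Strip inner whitespace so that "- 1" parses as -1.
    return [Fraction("".join(tok.split())) for tok in NUMBER.findall(text)]


def skeleton(text):
    # Replace every number by a placeholder and drop all whitespace.
    return "".join(NUMBER.sub("#", text).split())


def close_enough(want, got, abs_tol=0, rel_tol=0):
    """Hypothetical tolerance check, not SageOutputChecker.check_output."""
    want_vals, got_vals = numbers(want), numbers(got)
    if skeleton(want) != skeleton(got) or len(want_vals) != len(got_vals):
        return False  # the non-numeric text differs
    for w, g in zip(want_vals, got_vals):
        radius = Fraction(abs_tol) + abs(w) * Fraction(rel_tol)
        if abs(w - g) > radius:
            return False
    return True


print(close_enough("10.0", "9.5", rel_tol="0.1"))
print(close_enough("10.0", "9.5", abs_tol="0.1"))
print(close_enough("[-1,2]", "[ - 1, 2]", abs_tol="1e-10"))
```

Comparing the text skeletons first is what rejects a result such as 'x + 0.8935153492877' even when its numeric part is within tolerance.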
Tolerance on Python 3 for string results with unicode prefix:
    sage: a = u'Cyrano'; a
    u'Cyrano'
    sage: b = [u'Fermat', u'Euler']; b
    [u'Fermat', u'Euler']
    sage: c = u'you'; c
    u'you'
Also allowance for the difference in reprs of type instances (i.e. classes) between Python 2 and Python 3:
    sage: int
    <class 'int'>
    sage: float
    <class 'float'>

human_readable_escape_sequences(string)
Make ANSI escape sequences human readable.
EXAMPLES:
    sage: print('This is \x1b[1mbold\x1b[0m text')
    This is <CSI-1m>bold<CSI-0m> text
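The transformation in this example can be approximated with a single regular expression. The sketch below only handles CSI sequences of the form ESC [ parameters letter; the pattern and the name show_escapes are assumptions, and the Sage method may cover more kinds of escape sequence.

```python
import re

# ESC [ followed by parameter characters and one final letter (a CSI sequence).
CSI = re.compile(r"\x1b\[([0-9;]*[A-Za-z])")


def show_escapes(text):
    """Hypothetical helper: render CSI sequences as visible <CSI-...> tags."""
    return CSI.sub(r"<CSI-\1>", text)


print(show_escapes("This is \x1b[1mbold\x1b[0m text"))
```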

output_difference(example, got, optionflags)
Report on the differences between the desired result and what was actually obtained.
If want is a MarkedOutput instance, takes into account the desired tolerance.
INPUT:
example – a doctest.Example instance
got – a string
optionflags – an integer, passed down to doctest.OutputChecker
OUTPUT:
a string, describing how got fails to match example.want
EXAMPLES:
    sage: from sage.doctest.parsing import MarkedOutput, SageOutputChecker
    sage: import doctest
    sage: optflag = doctest.NORMALIZE_WHITESPACE|doctest.ELLIPSIS
    sage: tentol = doctest.Example('',MarkedOutput("10.0\n").update(tol=.1))
    sage: tenabs = doctest.Example('',MarkedOutput("10.0\n").update(abs_tol=.1))
    sage: tenrel = doctest.Example('',MarkedOutput("10.0\n").update(rel_tol=.1))
    sage: zerotol = doctest.Example('',MarkedOutput("0.0\n").update(tol=.1))
    sage: zeroabs = doctest.Example('',MarkedOutput("0.0\n").update(abs_tol=.1))
    sage: zerorel = doctest.Example('',MarkedOutput("0.0\n").update(rel_tol=.1))
    sage: tlist = doctest.Example('',MarkedOutput("[10.0, 10.0, 10.0, 10.0, 10.0, 10.0]\n").update(abs_tol=0.987))
    sage: zero = "0.0"
    sage: nf = "9.5"
    sage: ten = "10.05"
    sage: eps = "-0.05"
    sage: L = "[9.9, 8.7, 10.3, 11.2, 10.8, 10.0]"
    sage: OC = SageOutputChecker()

    sage: print(OC.output_difference(tenabs,nf,optflag))
    Expected:
        10.0
    Got:
        9.5
    Tolerance exceeded:
        10.0 vs 9.5, tolerance 5e-1 > 1e-1
    sage: print(OC.output_difference(tentol,zero,optflag))
    Expected:
        10.0
    Got:
        0.0
    Tolerance exceeded:
        10.0 vs 0.0, tolerance 1e0 > 1e-1
    sage: print(OC.output_difference(tentol,eps,optflag))
    Expected:
        10.0
    Got:
        -0.05
    Tolerance exceeded:
        10.0 vs -0.05, tolerance 2e0 > 1e-1
    sage: print(OC.output_difference(tlist,L,optflag))
    Expected:
        [10.0, 10.0, 10.0, 10.0, 10.0, 10.0]
    Got:
        [9.9, 8.7, 10.3, 11.2, 10.8, 10.0]
    Tolerance exceeded in 2 of 6:
        10.0 vs 8.7, tolerance 2e0 > 9.87e-1
        10.0 vs 11.2, tolerance 2e0 > 9.87e-1

sage.doctest.parsing.get_source(example)
Return the source with the leading ‘sage: ’ stripped off.
EXAMPLES:
    sage: from sage.doctest.parsing import get_source
    sage: from sage.doctest.sources import DictAsObject
    sage: example = DictAsObject({})
    sage: example.sage_source = "2 + 2"
    sage: example.source = "sage: 2 + 2"
    sage: get_source(example)
    '2 + 2'
    sage: example = DictAsObject({})
    sage: example.source = "3 + 3"
    sage: get_source(example)
    '3 + 3'

sage.doctest.parsing.make_marked_output(s, D)
Auxiliary function for pickling.
EXAMPLES:
    sage: from sage.doctest.parsing import make_marked_output
    sage: s = make_marked_output("0.0007401", {'abs_tol':.0000001})
    sage: s
    u'0.0007401'
    sage: s.abs_tol
    1.00000000000000e-7

sage.doctest.parsing.normalize_bound_method_repr(s)
Normalize differences between Python 2 and 3 in how bound methods are represented.
On Python 2 bound methods are represented using the class name of the object the method was bound to, whereas on Python 3 they are represented with the fully-qualified name of the function that implements the method.
In the context of a doctest it’s almost impossible to convert accurately from the latter to the former or vice-versa, so we simplify the reprs of bound methods to just the bare method name.
This is slightly regressive since it means one can’t use the repr of a bound method to test whether some element is getting a method from the correct class (important sometimes in the cases of dynamic classes). However, such tests could be written more explicitly to emphasize that they are testing such behavior.
EXAMPLES:
    sage: from sage.doctest.parsing import normalize_bound_method_repr
    sage: el = Semigroups().example().an_element()
    sage: el
    42
    sage: el.is_idempotent
    <bound method ....is_idempotent of 42>
    sage: normalize_bound_method_repr(repr(el.is_idempotent))
    '<bound method is_idempotent of 42>'
An example where the object repr contains whitespace:
    sage: U = DisjointUnionEnumeratedSets(
    ....:     Family([1, 2, 3], Partitions), facade=False)
    sage: U._element_constructor_
    <bound method ...._element_constructor_default of Disjoint union of Finite family {...}>
    sage: normalize_bound_method_repr(repr(U._element_constructor_))
    '<bound method _element_constructor_default of Disjoint union of Finite family {...}>'
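The simplification to a bare method name can be sketched with one substitution that discards any dotted qualifier in front of the method name. The pattern and the name strip_qualname are assumptions for illustration; the regular expression used by Sage may differ.

```python
import re

# Dotted qualifiers in front of the method name are optional and discarded.
BOUND = re.compile(r"<bound method (?:[^\s.]+\.)*(\w+) of ")


def strip_qualname(text):
    """Hypothetical helper, not Sage's normalize_bound_method_repr."""
    return BOUND.sub(r"<bound method \1 of ", text)


class Demo:
    def ping(self):
        return "pong"


# On Python 3 the repr contains the qualified name "Demo.ping".
print(strip_qualname(repr(Demo().ping)))
```

Only the part before " of " is rewritten, so an object repr containing whitespace passes through unchanged.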

sage.doctest.parsing.normalize_long_repr(s)
Simple conversion from the Python 2 representation of long ints (that is, integers with the L suffix) to the Python 3 representation (same number, without the suffix, since Python 3 doesn’t have a distinct long type).
Note: This just uses a simple regular expression that can’t distinguish representations of long objects from strings containing a long repr.
EXAMPLES:
    sage: from sage.doctest.parsing import normalize_long_repr
    sage: normalize_long_repr('10L')
    '10'
    sage: normalize_long_repr('[10L, -10L, +10L, "ALL"]')
    '[10, -10, +10, "ALL"]'
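A regular expression of the kind the note describes might look like the sketch below. The pattern is an assumption chosen to reproduce the examples, and the last call shows the stated limitation: a quoted string that merely contains a long repr is rewritten too.

```python
import re

# An optionally signed run of digits followed by the Python 2 long suffix.
LONG = re.compile(r"([+-]?[0-9]+)[lL]")


def drop_long_suffix(text):
    """Hypothetical helper, not Sage's normalize_long_repr."""
    return LONG.sub(r"\1", text)


print(drop_long_suffix('[10L, -10L, +10L, "ALL"]'))
print(drop_long_suffix('"10L"'))   # the string content is changed as well
```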

sage.doctest.parsing.normalize_type_repr(s)
Convert the repr of type objects (e.g. int, float) from their Python 2 representation to their Python 3 representation.
In Python 2, the repr of built-in types like int is like <type 'int'>, whereas user-defined pure Python classes are displayed as <class 'classname'>. On Python 3 this was normalized so that built-in types are represented the same as user-defined classes (e.g. <class 'int'>).
This simply normalizes all class/type reprs to the Python 3 convention for the sake of output checking.
EXAMPLES:
    sage: from sage.doctest.parsing import normalize_type_repr
    sage: s = "<type 'int'>"
    sage: normalize_type_repr(s)
    "<class 'int'>"
    sage: normalize_type_repr(repr(float))
    "<class 'float'>"
This can work on multi-line output as well:
    sage: s = "The desired output was <class 'int'>\n"
    sage: s += "The received output was <type 'int'>"
    sage: print(normalize_type_repr(s))
    The desired output was <class 'int'>
    The received output was <class 'int'>
And should work when types are embedded in other nested expressions:
    sage: normalize_type_repr(repr([Integer, float]))
    "[<class 'sage.rings.integer.Integer'>, <class 'float'>]"
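The normalization amounts to rewriting every <type '...'> to <class '...'>. A minimal sketch, assuming a single substitution is enough (the helper name to_class_repr is hypothetical):

```python
import re

# Capture the quoted type name so it can be reused in the replacement.
TYPE = re.compile(r"<type '([^']+)'>")


def to_class_repr(text):
    """Hypothetical helper, not Sage's normalize_type_repr."""
    return TYPE.sub(r"<class '\1'>", text)


print(to_class_repr("<type 'int'>"))
print(to_class_repr("want <class 'int'>\ngot <type 'int'>"))
```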

sage.doctest.parsing.parse_optional_tags(string)
Return a set consisting of the optional tags from the following set that occur in a comment on the first line of the input string.
‘long time’
‘not implemented’
‘not tested’
‘known bug’
‘py2’
‘py3’
‘arb216’
‘arb218’
‘optional: PKG_NAME’ – the set will just contain ‘PKG_NAME’
EXAMPLES:
    sage: from sage.doctest.parsing import parse_optional_tags
    sage: parse_optional_tags("sage: magma('2 + 2')# optional: magma")
    {'magma'}
    sage: parse_optional_tags("sage: #optional -- mypkg")
    {'mypkg'}
    sage: parse_optional_tags("sage: print(1) # parentheses are optional here")
    set()
    sage: parse_optional_tags("sage: print(1) # optional")
    {''}
    sage: sorted(list(parse_optional_tags("sage: #optional -- foo bar, baz")))
    ['bar', 'foo']
    sage: sorted(list(parse_optional_tags(" sage: factor(10^(10^10) + 1) # LoNg TiME, NoT TeSTED; OptioNAL -- P4cka9e")))
    ['long time', 'not tested', 'p4cka9e']
    sage: parse_optional_tags(" sage: raise RuntimeError # known bug")
    {'bug'}
    sage: sorted(list(parse_optional_tags(" sage: determine_meaning_of_life() # long time, not implemented")))
    ['long time', 'not implemented']
We don’t parse inside strings:
    sage: parse_optional_tags(" sage: print(' # long time')")
    set()
    sage: parse_optional_tags(" sage: print(' # long time') # not tested")
    {'not tested'}
UTF-8 works:
    sage: parse_optional_tags("'ěščřžýáíéďĎ'")
    set()
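The tag extraction can be sketched as a case-insensitive scan of the comment on the first line. The pattern below is an assumption chosen to reproduce the outputs shown above, including 'known bug' being reported as 'bug' and a bare 'optional' yielding the empty tag. Unlike the Sage function, this simplified version treats the first '#' as the start of the comment, so it does not skip '#' characters inside string literals.

```python
import re

# Either a fixed tag, or "optional" followed by package names.
TAGS = re.compile(
    r"(arb216|arb218|py2|py3|long time|not implemented|not tested|known bug)"
    r"|([^ a-z]\s*optional\s*[:-]*((?:\s|\w)*))"
)


def tags_in_comment(line):
    """Hypothetical sketch; it does not skip '#' inside string literals."""
    first_line = line.split("\n", 1)[0]
    hash_at = first_line.find("#")
    if hash_at == -1:
        return set()
    comment = first_line[hash_at:].lower()
    tags = set()
    for m in TAGS.finditer(comment):
        fixed = m.group(1)
        if fixed == "known bug":
            tags.add("bug")      # matches the documented {'bug'} result
        elif fixed:
            tags.add(fixed)
        else:
            # A bare "optional" yields the empty tag, as in the examples.
            tags.update(m.group(3).split() or [""])
    return tags


print(sorted(tags_in_comment("sage: #optional -- foo bar, baz")))
```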

sage.doctest.parsing.parse_tolerance(source, want)
Return a version of want marked up with the tolerance tags specified in source.
INPUT:
source – a string, the source of a doctest
want – a string, the desired output of the doctest
OUTPUT:
want if there are no tolerance tags specified; a MarkedOutput version otherwise.
EXAMPLES:
    sage: from sage.doctest.parsing import parse_tolerance
    sage: marked = parse_tolerance("sage: s.update(abs_tol = .0000001)", "")
    sage: type(marked)
    <... 'str'>
    sage: marked = parse_tolerance("sage: s.update(tol = 0.1); s.rel_tol # abs tol 0.01 ", "")
    sage: marked.tol
    0
    sage: marked.rel_tol
    0
    sage: marked.abs_tol
    0.010000000000000000000...?
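As the examples show, only a tag in the trailing comment counts; keyword arguments such as abs_tol in the code itself are ignored. A simplified sketch of reading such a tag follows. The pattern, the returned dictionary and the name tolerance_tags are assumptions; the sketch requires an explicit value after "tol" and treats the first '#' as the start of the comment.

```python
import re
from fractions import Fraction

# Optional "abs"/"rel" qualifier, the word "tol", then a decimal value.
TOL = re.compile(r"\b(abs|rel)?\s*tol\s+(\d+(?:\.\d+)?(?:[eE][+-]?\d+)?)")


def tolerance_tags(source):
    """Hypothetical sketch; it treats the first '#' as the comment start."""
    comment = source.partition("#")[2]
    m = TOL.search(comment)
    if m is None:
        return {}
    key = (m.group(1) + "_tol") if m.group(1) else "tol"
    return {key: Fraction(m.group(2))}


print(tolerance_tags("sage: s.update(tol = 0.1); s.rel_tol # abs tol 0.01 "))
print(tolerance_tags("sage: s.update(abs_tol = .0000001)"))
```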

sage.doctest.parsing.pre_hash(s)
Prepends a string with its length.
EXAMPLES:
    sage: from sage.doctest.parsing import pre_hash
    sage: pre_hash("abc")
    '3:abc'
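The documented behaviour is easy to reproduce in plain Python. One plausible motivation, stated here as an assumption since the source does not give one, is that a length prefix keeps concatenations of several strings unambiguous before they are hashed. The helper name length_prefixed is hypothetical.

```python
def length_prefixed(s):
    """Hypothetical equivalent of the documented behaviour of pre_hash."""
    return "%d:%s" % (len(s), s)


# Without the prefix, ("ab", "c") and ("a", "bc") would concatenate identically.
print(length_prefixed("ab") + length_prefixed("c"))
print(length_prefixed("a") + length_prefixed("bc"))
```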

sage.doctest.parsing.reduce_hex(fingerprints)
Return a symmetric function of the arguments as hex strings.
The arguments should be 32 character strings consisting of hex digits: 0-9 and a-f.
EXAMPLES:
    sage: from sage.doctest.parsing import reduce_hex
    sage: reduce_hex(["abc", "12399aedf"])
    '0000000000000000000000012399a463'
    sage: reduce_hex(["12399aedf","abc"])
    '0000000000000000000000012399a463'
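The outputs above are consistent with a bitwise XOR of the inputs read as hexadecimal integers (0xabc XOR 0x12399aedf = 0x12399a463), padded to 32 hex digits. The sketch below assumes that this is the operation; the name xor_hex is hypothetical. XOR is commutative and associative, which is what makes the result independent of the order of the fingerprints.

```python
from functools import reduce
from operator import xor


def xor_hex(fingerprints):
    """Hypothetical sketch: XOR the values, print as 32 hex digits."""
    total = reduce(xor, (int(fp, 16) for fp in fingerprints), 0)
    return format(total, "032x")


print(xor_hex(["abc", "12399aedf"]))
print(xor_hex(["12399aedf", "abc"]))
```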

sage.doctest.parsing.remove_unicode_u(string)
Given a string, try to remove all unicode u prefixes inside.
This will help to keep the same doctest results in Python 2 and Python 3. The input string is typically the documentation of a method or function. This string may contain some letters u that are Python 2 unicode prefixes. The aim is to remove all of these u and only them.
INPUT:
string – either unicode or bytes (if bytes, it will be converted to unicode assuming UTF-8)
OUTPUT:
unicode string
EXAMPLES:
    sage: from sage.doctest.parsing import remove_unicode_u as remu
    sage: remu("u'you'")
    u"'you'"
    sage: remu('u')
    u'u'
    sage: remu("[u'am', 'stram', u'gram']")
    u"['am', 'stram', 'gram']"
    sage: remu('[u"am", "stram", u"gram"]')
    u'["am", "stram", "gram"]'
This deals correctly with nested quotes:
    sage: str = '''[u"Singular's stuff", u'good']'''
    sage: print(remu(str))
    ["Singular's stuff", 'good']
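A much simplified version of this idea removes every u that starts a word and is immediately followed by a quote character. The pattern and the name strip_u are assumptions for illustration; unlike the Sage function, this sketch does not track where string literals begin and end, so a standalone word "u" placed directly before a closing quote inside a string would also be removed.

```python
import re

# A "u" that starts a word and is immediately followed by a quote.
U_PREFIX = re.compile(r"\bu(?=['\"])")


def strip_u(text):
    """Hypothetical simplified helper, not Sage's remove_unicode_u."""
    return U_PREFIX.sub("", text)


print(strip_u("[u'am', 'stram', u'gram']"))
print(strip_u('''[u"Singular's stuff", u'good']'''))
```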