Testing - specify literal patterns instead of regex patterns
w.r.t. [[https://gitweb.torproject.org/stem.git/commit/?id=7711050619af1a2f8ecf4aa87f774baa5f367b3b|7711050619af1a2f8ecf4aa87f774baa5f367b3b]], I was planning to file this ticket anyways, so might as well now for the discussion.
atagar linked [StackOverflow answer] in the commit message.
(I was a bit behind on filing this ticket, but already started doing the literal re.escape()
bit in my test cases. Hence atagar's comment in the commit.)
Anyway, here's the ticket text I had started to prep - now //slightly// tweaked:
The testing codebase makes pretty extensive use of unittest.TestCase.assertRaisesRegexp()
.
An example is [[https://gitweb.torproject.org/stem.git/tree/test/integ/client/connection.py?id=0192b29a4784465e5f69f11ced584a54644e4a90#n36|here]]:
def test_no_common_link_protocol(self):
"""
Connection without a commonly accepted link protocol version.
"""
for link_protocol in (1, 2, 6, 20):
self.assertRaisesRegexp(stem.SocketError, 'Unable to establish a common link protocol with 127.0.0.1:1113', Relay.connect, '127.0.0.1', test.runner.ORPORT, [link_protocol])
The second argument is treated as a regex pattern, so it will actually match more than intended - possibly leading to some false negatives (although unlikely in this example).
The use of unittest.TestCase.assertRaisesRegexp()
without re.escape()
for a literal expression is decently common - the use of it intended for a regex is fairly rare (I haven't come across a test yet that I can recall).
Having a "literal" check is possible in (at least) two ways:
with self.assertRaises(SomeException) as cm:
do_something(some_param)
self.assertEqual(str(cm.exception), expected_literal)
self.assertRaisesRegexp(SomeException, '^%s$' % re.escape(expected_literal), do_something, some_param)
Importantly, both of these check for //exact// string.
Much of the codebase doesn't use re.escape()
, and where it does, I didn't see any ^
or $
(although I didn't search exhaustively), so that could match on substrings, also allowing for subtle bugs.
So we may consider a helper class StemTestCase(unittest.TestCase)
which adds an assertRaisesLiteral()
method, to make it easier to do this. (Or some other means of adding that in.)
We could of course take the second option with re.escape()
, but since a lot of the codebase already doesn't seem to do that, it's probably quite easy to forget, especially the '^%s$' %
part.
atagar: thoughts on these options? or leave things as-is / wontfix
?
(Filing this as a //task// ticket, as it's probably a discussion point more than anything else. I'd expect from the edge cases, there //could// be some defects, some enhancements.)
See especially [[comment:2|atagar's comment about implementation thoughts]].