Problem to solve
I have a method parse_doc that should dynamically determine a parsing function for document_id based on the value of the document_id string:
# list of documents, each of which may need a different custom parser depending arbitrarily on its name:
documents = [
    "my_file.py",
    "my_file.sh",
    "another_file.txt",
    "https://some_domain.org/some_page",
]
class Parser(object):
    def parse_doc(self, document_id, *args, **kwargs):
        # identify the correct parser function
        parser = return_doc_parser(document_id)
        parsed_doc = parser(document_id, *args, **kwargs)
        return parsed_doc
Proposed solution
My thought was to define a type inference function that maps each document_id to a document type, then dispatch to the parser registered for that type, something like the following:
import requests
import parsers

# document type -> parser function handle map:
PARSER_DISPATCH = {
    "python": parsers.py_parser,
    "text": parsers.text_parser,
    "url": parsers.url_parser,
}
def return_doc_type(document_id):
    """Return custom type of document_id."""
    # Default to False so url_exists is always defined, even for non-200 responses.
    url_exists = False
    try:
        response = requests.get(document_id)
        url_exists = response.status_code == 200
    except Exception:
        url_exists = False
    try:
        # Open with a context manager so the file handle is closed again.
        with open(document_id, "r"):
            file_exists = True
    except IOError:
        file_exists = False
    if file_exists and document_id.endswith(".py"):
        doc_type = "python"
    elif file_exists and document_id.endswith(".txt"):
        doc_type = "text"
    elif url_exists:
        doc_type = "url"
    else:
        raise ValueError("no doctype identified.")
    return doc_type
class Parser(object):
    def parse_doc(self, document_id, *args, **kwargs):
        doc_type = return_doc_type(document_id)
        parser = PARSER_DISPATCH[doc_type]
        parsed_doc = parser(document_id, *args, **kwargs)
        return parsed_doc
My concerns: such an informal type system seems a bit unusual, and the type inference logic could quickly grow out of hand as more document types are added.
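To make the second concern concrete, here is a minimal sketch of one alternative I could imagine: a registry of (predicate, parser) pairs, so that each new document type adds one registration instead of another branch in a central if/elif chain. The names register_parser, is_url, has_suffix and _PARSER_REGISTRY are hypothetical, the parsers are stand-ins that only return a tuple, and the predicates are purely syntactic (no file or network existence checks), which differs from the inference function above.

```python
from urllib.parse import urlparse

# Hypothetical registry of (predicate, parser) pairs, checked in registration order.
_PARSER_REGISTRY = []


def register_parser(predicate):
    """Decorator that registers a parser for document ids accepted by predicate."""
    def decorator(parser):
        _PARSER_REGISTRY.append((predicate, parser))
        return parser
    return decorator


def is_url(document_id):
    """Syntactic check only; no network request is made."""
    return urlparse(document_id).scheme in ("http", "https")


def has_suffix(suffix):
    """Build a predicate matching non-URL ids that end with suffix."""
    return lambda document_id: not is_url(document_id) and document_id.endswith(suffix)


# Stand-in parsers; real ones would actually read and parse the document.
@register_parser(has_suffix(".py"))
def py_parser(document_id, *args, **kwargs):
    return ("python", document_id)


@register_parser(has_suffix(".txt"))
def text_parser(document_id, *args, **kwargs):
    return ("text", document_id)


@register_parser(is_url)
def url_parser(document_id, *args, **kwargs):
    return ("url", document_id)


def return_doc_parser(document_id):
    """Return the first registered parser whose predicate accepts document_id."""
    for predicate, parser in _PARSER_REGISTRY:
        if predicate(document_id):
            return parser
    raise ValueError("no parser registered for %r" % (document_id,))


class Parser(object):
    def parse_doc(self, document_id, *args, **kwargs):
        parser = return_doc_parser(document_id)
        return parser(document_id, *args, **kwargs)
```

With this sketch, "my_file.sh" still raises ValueError because nothing is registered for it, and the order of registration decides which parser wins when several predicates match.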
My question
Is this a reasonable approach? Are there other well-known methods to do value-based dispatch? I've googled high and low and found mostly type-based dispatch (multi-methods, multiple dispatch, method overloading) discussed.