datapad.io
Convenience functions for creating Sequences from files and other input sources.
Functions
read_csv (path_or_paths) |
Construct a Sequence from csv text files |
read_json (path_or_paths[, lines, ignore_errors]) |
Construct a Sequence from json text files |
read_text (path_or_paths[, lines]) |
Construct a Sequence from text files |
-
datapad.io.read_csv(path_or_paths)
Construct a Sequence from csv text files. Each element of the sequence is one row, given as a list of strings.
Parameters: path_or_paths – str or list of strings. A path, or list of paths. Paths may contain glob patterns like “data/metadata-*-a.txt”.
>>> seq = read_csv(["data/meta-*.csv"])
>>> seq.collect() # doctest: +SKIP
[["foo", "bar"], ["1", "2"], ["3", "4"]]
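The behavior described above (glob expansion, then one list of strings per CSV row) can be approximated with the standard library alone. This is a minimal sketch, not datapad's implementation: `expand_paths` and `iter_csv_rows` are hypothetical helper names, the result is a plain list rather than a Sequence, and sorting the matched files is an assumption.

```python
import csv
import glob
import os
import tempfile


def expand_paths(path_or_paths):
    """Expand a path or list of paths (glob patterns allowed) into file paths."""
    if isinstance(path_or_paths, str):
        path_or_paths = [path_or_paths]
    files = []
    for pattern in path_or_paths:
        # Assumption: matches are visited in sorted order for determinism.
        files.extend(sorted(glob.glob(pattern)))
    return files


def iter_csv_rows(path_or_paths):
    """Yield each CSV row as a list of strings, file by file."""
    for path in expand_paths(path_or_paths):
        with open(path, newline="") as f:
            yield from csv.reader(f)


# Demo on a throwaway file so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    with open(os.path.join(tmp, "meta-1.csv"), "w", newline="") as f:
        f.write("foo,bar\n1,2\n3,4\n")
    rows = list(iter_csv_rows([os.path.join(tmp, "meta-*.csv")]))

print(rows)  # [['foo', 'bar'], ['1', '2'], ['3', '4']]
```

As in the example above, the header row is returned as an ordinary element and every value stays a string.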
-
datapad.io.read_json(path_or_paths, lines=True, ignore_errors=False)
Construct a Sequence from json text files.
Parameters:
- path_or_paths – str or list of strings. A path, or list of paths. Paths may contain glob patterns like “data/metadata-*-a.txt”.
- lines – bool (default: True). If True, each element of the sequence comes from decoding one line of a json-lines text file (see: http://jsonlines.org/examples/). If False, each element of the sequence is obtained by running json.loads on the entire contents of one text file.
- ignore_errors – bool (default: False). If True, skip any elements that raise json decoding errors.
>>> seq = read_json(["data/meta-*.json"], lines=True)
>>> seq.collect() # doctest: +SKIP
[{"dog": 1}, {"dog": 2}]
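To make the interaction of `lines` and `ignore_errors` concrete, here is a rough standard-library sketch of the semantics described above. It is not datapad's implementation: `iter_json` is a hypothetical name, it yields plain Python values rather than building a Sequence, and skipping blank lines is an assumption.

```python
import glob
import json
import os
import tempfile


def iter_json(path_or_paths, lines=True, ignore_errors=False):
    """Yield decoded JSON values from files matching the given path(s)."""
    patterns = [path_or_paths] if isinstance(path_or_paths, str) else path_or_paths
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                # One chunk per line (JSON Lines) or one chunk for the whole file.
                chunks = f if lines else [f.read()]
                for chunk in chunks:
                    if not chunk.strip():
                        continue  # assumption: blank input is skipped
                    try:
                        yield json.loads(chunk)
                    except json.JSONDecodeError:
                        if not ignore_errors:
                            raise  # strict mode: surface the bad element


with tempfile.TemporaryDirectory() as tmp:
    # A JSON Lines file with one malformed line in the middle.
    with open(os.path.join(tmp, "meta-1.json"), "w") as f:
        f.write('{"dog": 1}\nnot json\n{"dog": 2}\n')
    # A single pretty-printed JSON document spanning several lines.
    with open(os.path.join(tmp, "whole.json"), "w") as f:
        f.write('{\n  "dog": 3\n}\n')

    pattern = os.path.join(tmp, "meta-*.json")
    skipped = list(iter_json(pattern, lines=True, ignore_errors=True))
    try:
        list(iter_json(pattern, lines=True, ignore_errors=False))
        strict_failed = False
    except json.JSONDecodeError:
        strict_failed = True
    whole = list(iter_json(os.path.join(tmp, "whole.json"), lines=False))

print(skipped)        # [{'dog': 1}, {'dog': 2}]
print(strict_failed)  # True
print(whole)          # [{'dog': 3}]
```

A multi-line JSON document only decodes with `lines=False`; with `lines=True` each of its lines would be decoded separately and fail.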
-
datapad.io.read_text(path_or_paths, lines=True)
Construct a Sequence from text files.
Parameters:
- path_or_paths – str or list of strings. A path, or list of paths. Paths may contain glob patterns like “data/metadata-*-a.txt”.
- lines – bool (default: True). If True, each element of the sequence comes from reading one line of a text file. If False, each element of the sequence is the entire contents of one text file.
>>> seq = read_text(["data/meta-*.txt"], lines=True)
>>> seq.collect() # doctest: +SKIP
["foo_a", "foo_b", "bar_a", "bar_b"]
>>> seq = read_text(["data/meta-*.txt"], lines=False)
>>> seq.collect() # doctest: +SKIP
["foo_a\nfoo_b", "bar_a\nbar_b"]
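The difference between the two `lines` modes can be reproduced with a short standard-library sketch. Again, this is only an illustration under stated assumptions, not datapad's code: `iter_text` is a hypothetical name, it yields plain strings, and stripping the trailing newline from each line is inferred from the example output above.

```python
import glob
import os
import tempfile


def iter_text(path_or_paths, lines=True):
    """Yield one string per line (lines=True) or one string per file (lines=False)."""
    patterns = [path_or_paths] if isinstance(path_or_paths, str) else path_or_paths
    for pattern in patterns:
        for path in sorted(glob.glob(pattern)):
            with open(path) as f:
                if lines:
                    for line in f:
                        # Assumption: the line terminator is not part of the element.
                        yield line.rstrip("\n")
                else:
                    yield f.read()


with tempfile.TemporaryDirectory() as tmp:
    # Two small files; names are chosen so that sorting visits "foo" first.
    with open(os.path.join(tmp, "meta-1.txt"), "w") as f:
        f.write("foo_a\nfoo_b")
    with open(os.path.join(tmp, "meta-2.txt"), "w") as f:
        f.write("bar_a\nbar_b")

    pattern = os.path.join(tmp, "meta-*.txt")
    by_line = list(iter_text([pattern], lines=True))
    by_file = list(iter_text([pattern], lines=False))

print(by_line)  # ['foo_a', 'foo_b', 'bar_a', 'bar_b']
print(by_file)  # ['foo_a\nfoo_b', 'bar_a\nbar_b']
```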