How can this information be structured?
(a) json-like tree
(b) vector tree (hiccup style)
(c) referenced graph
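To make the three options concrete, here is a rough sketch of one tiny directory layout in each shape. The layout (a root holding a 10-byte file and a directory a with a 1-byte file) and all names are made up for illustration, and the exact shape of each variant is only one possible reading of the list above.

```python
# (a) JSON-like tree: nested maps, a directory maps names to children,
#     a file maps its name to its size.
json_tree = {"/": {"a": {"g": 1}, "f": 10}}

# (b) vector tree: each directory is a list [name, child, child, ...],
#     files appear as bare sizes.
vector_tree = ["/", ["a", 1], 10]

# (c) referenced graph: one flat map, directories point at their
#     children by key instead of nesting them.
graph = {
    ("/",): {"dirs": [("/", "a")], "files": 10},
    ("/", "a"): {"dirs": [], "files": 1},
}
```

In (c) every directory is reachable with a single lookup, which is what makes the later summing step simple.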
The terminal output can be divided into groups of `cd` commands followed by one `ls` command.
`cd` only ever has one argument with no spaces in between.
The only relevant information is the directory names and the filesizes:
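A minimal sketch of that reduction, assuming commands are prefixed with "$ ", directory entries look like "dir NAME" and file entries look like "SIZE NAME". The function name parse_line and the sample input are hypothetical.

```python
def parse_line(line):
    """Reduce one line of terminal output to the parts that matter."""
    parts = line.split()
    if parts[:2] == ["$", "cd"]:
        return ("cd", parts[2])      # cd has exactly one argument
    if parts[:2] == ["$", "ls"]:
        return None                  # ls carries no information of its own
    if parts[0] == "dir":
        return ("dir", parts[1])     # keep the directory name
    return ("file", int(parts[0]))   # keep the size, drop the file name


# Made-up sample output, only for illustration.
SAMPLE = """$ cd /
$ ls
dir a
100 b.txt
$ cd a
$ ls
50 c.txt"""

events = [e for e in map(parse_line, SAMPLE.splitlines()) if e is not None]
```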
For approach (c), the directory names have to be turned into absolute paths, since there could be multiple directories with the same name.
A path can now act as a key for the contents of its corresponding directory:
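One way this could look, as a sketch rather than the original code: the current directory is tracked as a tuple of path segments, and that tuple is the key. It assumes the output starts with "cd /", that "cd .." moves up one level, and that each directory is listed at most once. build_graph and the sample are hypothetical.

```python
def build_graph(lines):
    """Map each absolute path (a tuple) to its child paths and own file sizes."""
    graph = {}
    cwd = ()
    for line in lines:
        parts = line.split()
        if parts[:2] == ["$", "cd"]:
            target = parts[2]
            if target == "/":
                cwd = ("/",)
            elif target == "..":
                cwd = cwd[:-1]
            else:
                cwd = cwd + (target,)
            graph.setdefault(cwd, {"dirs": [], "files": 0})
        elif parts[:2] == ["$", "ls"]:
            continue
        elif parts[0] == "dir":
            # Store the child as an absolute path, not as a bare name.
            graph[cwd]["dirs"].append(cwd + (parts[1],))
        else:
            graph[cwd]["files"] += int(parts[0])
    return graph


# Made-up sample with two directories that are both named x.
SAMPLE = """$ cd /
$ ls
dir a
dir b
10 f
$ cd a
$ ls
dir x
$ cd x
$ ls
1 g
$ cd ..
$ cd ..
$ cd b
$ ls
dir x
$ cd x
$ ls
2 h"""

graph = build_graph(SAMPLE.splitlines())
```

The two x directories end up under the distinct keys ("/", "a", "x") and ("/", "b", "x"), so their contents do not collide.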
The paths-graph can then be walked completely, summing up directory sizes along the way:
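A sketch of such a walk, assuming the graph maps each path tuple to {"dirs": [child paths], "files": summed file sizes}. That shape, the name dir_sizes and the sample graph are assumptions for illustration.

```python
def dir_sizes(graph, root=("/",)):
    """Return a map from directory path to its total size, subdirectories included."""
    sizes = {}

    def walk(path):
        # A directory that was listed but never entered counts as empty.
        node = graph.get(path, {"dirs": [], "files": 0})
        total = node["files"] + sum(walk(child) for child in node["dirs"])
        sizes[path] = total
        return total

    walk(root)
    return sizes


# Hand-written sample graph, only for illustration.
graph = {
    ("/",): {"dirs": [("/", "a"), ("/", "b")], "files": 10},
    ("/", "a"): {"dirs": [("/", "a", "x")], "files": 0},
    ("/", "a", "x"): {"dirs": [], "files": 1},
    ("/", "b"): {"dirs": [("/", "b", "x")], "files": 0},
    ("/", "b", "x"): {"dirs": [], "files": 2},
}

sizes = dir_sizes(graph)
```

Each directory is visited once, and its size is recorded on the way back up, so a parent always sees the finished totals of its children.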
With the `dir->size` map generated in part 1, the second part is trivial:
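The task of part 2 is not spelled out here. As a sketch, assume it asks for the size of the smallest directory whose deletion frees enough space, given a total disk size and a required amount of free space. The function name, the parameters and the numbers below are hypothetical.

```python
def smallest_dir_to_delete(sizes, total_space, needed_space, root=("/",)):
    """Size of the smallest directory whose removal leaves needed_space free."""
    free = total_space - sizes[root]   # the root size is the space in use
    must_free = needed_space - free    # how much still has to be released
    return min(size for size in sizes.values() if size >= must_free)


# Made-up directory sizes, in the shape of the dir->size map.
sizes = {
    ("/",): 13,
    ("/", "a"): 1,
    ("/", "a", "x"): 1,
    ("/", "b"): 2,
    ("/", "b", "x"): 2,
}

answer = smallest_dir_to_delete(sizes, total_space=20, needed_space=9)
```

With 13 of 20 units in use, 7 are free, so at least 2 more have to be released, and the smallest directory that large has size 2.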