00:00:01  * ircretaryquit (Remote host closed the connection)
00:00:09  * ircretaryjoined
00:11:03  * domanicjoined
00:15:30  * pfrazequit (Remote host closed the connection)
00:16:51  * pfrazejoined
00:22:18  * domanic_joined
00:22:50  * domanicquit (Ping timeout: 246 seconds)
00:40:18  * pfrazequit (Remote host closed the connection)
00:45:46  * domanic_quit (Ping timeout: 256 seconds)
00:56:53  <substack>jfhbrook: another one :/ https://github.com/jfhbrook/node-ecstatic/pull/162
00:57:07  <substack>I should add some command-line options to the test suite
01:21:34  <jjjohnny>substack: jfhbrook add this to ecstatic so no port colision at 8000? https://www.npmjs.com/package/answerver
01:32:07  * machtyquit (Ping timeout: 240 seconds)
01:32:07  * owenb_____quit (Ping timeout: 240 seconds)
01:34:27  * mappumquit (Ping timeout: 240 seconds)
01:35:19  * owenb_____joined
01:35:51  * mappumjoined
01:37:12  * machtyjoined
01:48:11  * phatedjoined
01:52:40  * phatedquit (Ping timeout: 240 seconds)
02:04:22  * eyeforeigneyejoined
02:04:33  * eyeforeigneyequit (Client Quit)
03:11:47  * stagasquit (Ping timeout: 240 seconds)
03:19:09  * juliangruberquit (Ping timeout: 244 seconds)
03:19:40  * juliangruberjoined
03:21:55  * pfrazejoined
04:06:58  * domanic_joined
04:11:36  * pfrazequit (Remote host closed the connection)
04:31:32  * domanic_quit (Ping timeout: 246 seconds)
05:37:55  * djcoinjoined
07:08:44  * fotoveritequit (Quit: fotoverite)
07:14:38  * fotoveritejoined
07:19:58  * fotoveritequit (Quit: fotoverite)
08:10:43  * peutetrejoined
08:31:53  * peutetrequit (Quit: ...)
08:55:29  * AndreasMadsenjoined
09:15:40  * stagasjoined
10:44:02  * AndreasMadsenquit (Remote host closed the connection)
10:51:26  * djcoinquit (Quit: WeeChat 1.0.1)
11:28:27  * joepie91changed nick to tumbleweeds
11:28:32  * tumbleweedschanged nick to joepie91
12:11:43  * stagasquit (Ping timeout: 272 seconds)
13:39:37  * refsetjoined
14:41:50  * contrahaxjoined
14:49:28  * contrahaxquit (Quit: Sleeping)
15:01:29  * fotoveritejoined
15:20:08  * stagasjoined
15:54:33  * ELLIOTTCABLEjoined
15:55:37  * refsetquit (Quit: Connection closed for inactivity)
16:47:25  * stagasquit (Ping timeout: 244 seconds)
16:50:37  * stagasjoined
16:54:24  * AndreasMadsenjoined
17:02:51  * stagasquit (Ping timeout: 240 seconds)
17:09:47  * pfrazejoined
17:25:54  * AndreasMadsenquit
17:31:56  * stagasjoined
17:34:32  * pfrazequit (Remote host closed the connection)
18:28:47  * dlmanningquit (Ping timeout: 240 seconds)
18:31:17  * dlmanningjoined
19:26:49  * stagasquit (Ping timeout: 264 seconds)
19:26:58  * jjjohnny_joined
19:27:54  * mollerse_joined
19:28:19  * jfhbrook_joined
19:28:41  * robertko1alskijoined
19:28:52  * hij1nxjoined
19:28:53  * ralphthe1injajoined
19:29:08  * greweb_joined
19:30:42  * timoxleyquit (Ping timeout: 260 seconds)
19:30:43  * Tristitiaquit (Ping timeout: 260 seconds)
19:30:43  * jfhbrookquit (Ping timeout: 260 seconds)
19:32:07  * timoxleyjoined
19:32:23  * paul_irishquit (Ping timeout: 240 seconds)
19:32:24  * jjjohnnyquit (Ping timeout: 240 seconds)
19:32:24  * benglquit (Ping timeout: 240 seconds)
19:32:24  * ralphtheninjaquit (Ping timeout: 240 seconds)
19:32:24  * grewebquit (Ping timeout: 240 seconds)
19:32:24  * mollersequit (Ping timeout: 240 seconds)
19:32:24  * robertkowalskiquit (Ping timeout: 240 seconds)
19:32:24  * hij1nx_quit (Ping timeout: 240 seconds)
19:32:25  * bengljoined
19:32:45  * paul_irish_joined
19:32:55  * benglchanged nick to Guest80443
19:33:44  * Tristit1ajoined
19:40:02  * mollerse_changed nick to mollerse
19:47:35  * Tristit1achanged nick to Tristitia
20:17:54  * robertko1alskichanged nick to robertkowalski
20:25:03  * pfrazejoined
20:30:23  * ralphthe1injachanged nick to ralphtheninja
20:31:06  * pfrazequit (Remote host closed the connection)
20:44:52  * phatedjoined
21:03:21  * jfhbrook_changed nick to jfhbrook
21:03:55  * phatedquit (Remote host closed the connection)
21:46:23  <jjjohnny_>WHOA THERE BUDDY
22:16:32  * pfrazejoined
22:25:23  * contrahaxjoined
22:25:29  * stagasjoined
22:39:21  <ogd>https://www.irccloud.com/pastebin/Pz6u06MJ/
22:39:34  <ogd>mikolalysenko: mafintosh and i are trying to figure out an approach to o/
22:40:23  <ogd>mikolalysenko: i should mention also: the graph is append-only and stored on disk so we are trying to figure out a path indexing scheme we can store on disk to make these lookups fast
22:41:56  <mikolalysenko>ogd: what operations does your dag need to support?
22:42:47  <ogd>mikolalysenko: .put([array of parent nodes, 1 parent node, or 0 parent nodes for a new head], buffer)
22:43:30  <ogd>mikolalysenko: and then this proposed .getPathSegments(node) that does whats in the gist
22:43:41  <mikolalysenko>so: create node, and link node are your two update operations
22:43:51  <mikolalysenko>do you ever have to do cut an edge?
22:44:06  <ogd>mikolalysenko: whats that mean?
22:44:27  <mikolalysenko>like a -> b at time 0, then later on you cut the graph so a and b are not connected directly
22:44:43  <mafintosh>mikolalysenko: no the graph is append only
22:45:53  <mikolalysenko>ok, so this is an incremental problem
22:45:58  <mafintosh>mikolalysenko: yes
22:46:16  <ogd>mikolalysenko: yea we wanna avoid traversals on get/put
22:46:23  <mikolalysenko>right
22:46:23  * freeman-labjoined
22:47:50  <mikolalysenko>so one screwy way to do this is you could use demaine's functional link cut tree data structure
22:47:53  <mikolalysenko>but that has some drawbacks
22:48:01  <mafintosh>mikolalysenko: we need a way to find all the paths from a given node to any root reachable from that node (and we want to have the paths deduplicated if that makes sense)
22:48:10  <mikolalysenko>like you can only have a finite number of iterators into the tree at any given moment
22:48:55  <mikolalysenko>mafintosh: there could be an exponential number of such paths though
22:50:17  <mikolalysenko>think about a sequence of diamonds:
22:50:17  <mikolalysenko>/\
22:50:17  <mikolalysenko>\/
22:50:17  <mikolalysenko>/\
22:50:17  <mikolalysenko>\/
22:50:18  <mikolalysenko>...
22:50:19  <mafintosh>mikolalysenko: i should correct my self - we only needs as many paths needed to cover all nodes in the subtree
22:51:21  <mikolalysenko>mafintosh: why not just report the whole subgraph?
22:52:14  <mikolalysenko>you could do that in linear time in the size of the result by just doing a boring old bfs traversal
22:53:33  <mafintosh>mikolalysenko: yes but is there a good way to index that?
22:53:39  * contrahaxquit (Ping timeout: 250 seconds)
22:53:44  <mikolalysenko>that is basically optimal
22:53:57  <mikolalysenko>if you have an output of size O(n), you can't process it faster than O(n)
22:54:09  <mikolalysenko>unless you return references or evaluate it lazily
22:54:18  <mikolalysenko>then you can amortize a bit by streaming, but that is it
22:54:30  <mafintosh>mikolalysenko: but can we index that on insert time somehow?
22:54:43  <mikolalysenko>wouldn't matter
22:54:54  <mikolalysenko>even if you indexed it, just reading out the index would take as long
22:55:37  <mafintosh>i'm not sure i follow :)
22:55:58  <mikolalysenko>I can write some psuedo code
22:56:09  <mikolalysenko>but before I do that, can you answer me a question:
22:56:15  <mikolalysenko>what do you do in the diamond path case?
22:56:29  <mafintosh>mikolalysenko: yes 2 sec - let me gist it
22:58:23  <mafintosh>mikolalysenko: https://gist.github.com/mafintosh/5357d2c64d4bb918878f
22:58:32  * stagasquit (Read error: Connection reset by peer)
22:58:44  * stagasjoined
22:58:47  <mikolalysenko>mafintosh: but fc isn't a path
22:59:13  <mikolalysenko>you could do acdfg maybe
22:59:39  <mafintosh>mikolalysenko: thats what i meant by "deduplicated" path
22:59:57  <mafintosh>mikolalysenko: i'm not sure i know the right terminology
23:00:12  <mikolalysenko>hmm
23:00:48  <mikolalysenko>what about this: maybe you want to decompose the dag rooted at a into a collection of paths where the length of each path is as long as possible
23:00:59  <mafintosh>mikolalysenko: so in that graph the two paths, abdeg and acdfg would cover all nodes
23:01:22  <ogd>mikolalysenko: yea that sounds accurate
23:01:30  <mafintosh>mikolalysenko: but i only writes nodes once to a path
23:01:34  * contrahaxjoined
23:01:41  * stagasquit (Client Quit)
23:01:50  <ogd>mikolalysenko: we want to make the graph into a series of paths, but we wont need to be able to do graph queries against the output
23:01:55  * stagasjoined
23:01:56  <mikolalysenko>well, why bother with paths then. why not just write all the nodes?
23:01:58  <ogd>s/make/decompose
23:03:21  <mikolalysenko>or how about this: maybe you want to cover the dag rooted at a with a tree?
23:03:25  * stagasquit (Read error: Connection reset by peer)
23:03:32  <mikolalysenko>and then take that tree and compute a heavy-light decomposition
23:03:35  * stagasjoined
23:04:24  * phatedjoined
23:04:56  <ogd>mikolalysenko: ah this is cool https://en.wikipedia.org/wiki/Heavy_path_decomposition
23:05:14  <mikolalysenko>yeah, but it only works for trees
23:05:33  <mikolalysenko>though you could tree-ify this dag and then run heavy-light on the result
23:05:54  <mikolalysenko>(or at least I've never heard of a way to do heavy-light decomposition on a dag)
23:06:15  <ogd>treeifying would just involve denormalizing (for lack of a better word) all subtrees?
23:06:15  <mafintosh>mikolalysenko: i wanna get the "paths" to be able to perform a binary search on each path against another dag to figure out which nodes they share
23:06:19  <ogd>subgraphs*
23:06:54  <mafintosh>mikolalysenko: because our dag is a merkle dag we know that if you have a node on a path you'll have everything before that node as well
23:07:22  <mafintosh>mikolalysenko: so fx in the diamond gist graph if you have d we know you have abc also
23:07:37  <ogd>the binary search is to find the latest common ancestor node
23:07:44  <ogd>latest/most recent
23:08:49  <mikolalysenko>ok, so you want to diff two dags quickly?
23:09:05  <mafintosh>mikolalysenko: thats what we end up doing
23:09:05  * phatedquit (Ping timeout: 246 seconds)
23:09:18  <mafintosh>mikolalysenko: after we have the paths
23:11:08  <mikolalysenko>hmm
23:11:15  <mikolalysenko>I don't quite get why they need to be paths
23:11:20  <ogd>mikolalysenko: the network api between the two graphs is: do you have these nodes [array of nodes]. other side responds by telling you which of the nodes it has
23:13:11  <ogd>mikolalysenko: the paths just give us an order we can use in our binary search to find the common ancestor
23:13:21  <ogd>mikolalysenko: but if the nodes were ordered in some other way that would work too i guess
23:13:33  <mafintosh>mikolalysenko: so we can do it in log(n) network roundtrips
23:13:54  <mikolalysenko>hmm
23:14:00  <mikolalysenko>how does lca help here?
23:14:12  <mikolalysenko>or what is the lca computing?
23:14:32  <ogd>mikolalysenko: then we know everything thats in common between the graphs since its a merkle graph and all nodes older than lca are already in sync
23:14:45  <ogd>mikolalysenko: then we can just request the data for all nodes after that point
23:14:58  <mikolalysenko>ok
23:19:08  <mikolalysenko>so I've been reading about communication complexity lately, but it seems to me that this problem is at least as hard as solving disjointness
23:19:20  <mikolalysenko>or basically you are computing the set difference of two sets of nodes
23:19:45  <ogd>brb this cafe is closing
23:19:49  <mikolalysenko>it doesn't seem like the fact that the nodes are in a dag helps much, since at least in theory you could get a bunch of nodes with a common parent
23:20:05  * stagasquit (Ping timeout: 250 seconds)
23:20:22  <mikolalysenko>I'm not sure what the multiround complexity of disjointness is, but I do know that in 1 round the best you can do is transmit the whole set or all O(n) nodes
23:22:08  <mikolalysenko>and I think this also holds for multiple rounds, even probabilistically
23:24:27  <mikolalysenko>what is less clear to me is how you can do it in less than O(n) bits, even using multiple rounds
23:34:32  <ogd>mikolalysenko: ok back on wifi
23:35:49  <mikolalysenko>so, I guess what I am saying first is that I don't get how it being a dag helps much. what if you have one node with a ton of children?
23:36:08  <ogd>mikolalysenko: our use case is kinda like the linux kernel git repo
23:36:45  <ogd>mikolalysenko: we have a bunch of nodes in a dag, at any time you might create new branches in the dag, when you push or pull with someone we want to establish which parts of the dag are already replicated so we can skip them and only send the new stuff
23:37:08  <ogd>mikolalysenko: its not very common that someone would create a million new branches, but they might have a million nodes in 1 new branch
23:37:53  * contrahaxquit (Quit: Sleeping)
23:40:27  <mikolalysenko>how does git do this?
23:40:49  <mikolalysenko>do you just send a timestamp or something, and then the server sends you everything after that time?
23:41:05  <ogd>mikolalysenko: mafintosh' current approach is to store a bunch of what he calls 'deduplicated paths' which means taking a dag and representing it as a bunch of paths, where each vertex is only a member of a single path. that way we can binary search each path to figure out if theres a point in that path that is already in sync. once we find the last node in
23:41:05  <ogd>a path that exists on the other side we know all nodes before that path are in sync. then we can request the data for the nodes that are missing
23:47:42  <mikolalysenko>hmm
23:48:02  <mikolalysenko>I think I could construct the paths, but I'm still not quite sure how this speeds things up
23:50:01  <mikolalysenko>as a sketch of how you could build this, you could just keep iteratively removing the longest path from the dag until you are done
23:50:08  <mikolalysenko>and because it is a dag, you can do that in linear time
23:50:47  <mafintosh>mikolalysenko: yea thats basically what we're doing now
23:52:06  <ogd>mikolalysenko: our problem is basically: given two graphs that no nothing about each other, how do you figure out what nodes in graph A are missing from graph B. minimizing for number of round trips, and allowing for new nodes to be appended without expensive traversals on each insert
23:52:15  <ogd>know nothing*
23:52:59  <ogd>mikolalysenko: so the 'deduplicated path' thing is just a construct that gives us a range to binary search against
23:53:16  <ogd>mikolalysenko: so that we know by searching all paths taht we cover 100% of the graph
23:58:31  <ogd>mikolalysenko: a more trivial implementation to solve the same problem would be to start by asking the remote graph for all its latest nodes (heads), including the list of parent nodes for that node. you check if you have those nodes and their parents, for any node that is missing you repeat until you have all nodes. this works but is a lot of round trips
23:59:19  <ogd>(by 'parent nodes' above i mean direct parents, not all ancestors)