00:33:06  * rgrinberg joined
00:43:54  * rgrinberg quit (Quit: WeeChat 1.5)
00:44:15  * rgrinberg joined
01:55:37  * DarkGod quit (Quit: Leaving)
02:49:10  * rgrinberg quit (Ping timeout: 258 seconds)
03:03:56  * rgrinberg joined
04:00:28  * a__ quit (Ping timeout: 258 seconds)
04:03:34  * a__ joined
04:06:52  * rgrinberg quit (Ping timeout: 240 seconds)
05:33:12  <rphillips>creationix: https://github.com/LuaJIT/LuaJIT/commit/c98660c8c3921e43029625e51166c9d273ad09df
05:33:28  <rphillips>hmm. someone wrote nm on that comment
05:47:12  * SkyRocknRoll joined
07:34:30  * DarkGod joined
07:54:58  * SkyRocknRoll quit (Ping timeout: 252 seconds)
08:12:11  * SkyRocknRoll joined
12:25:08  * rgrinberg joined
12:41:05  * devurandom quit (Remote host closed the connection)
12:43:04  * devurandom joined
12:52:17  * SkyRocknRoll quit (Quit: Ex-Chat)
13:27:17  * rendar joined
14:46:12  <creationix>rphillips, cool mraleph wants to help
14:46:25  <creationix>I wonder how to get -jdump from the agent
14:50:32  <rphillips>same guy on the mailing list... i posted the bug on their mailing list
14:51:09  <creationix>he's one of the original V8 authors/designers and knows me
14:51:20  <creationix>(mraleph that is)
14:51:49  <creationix>so the source to jdump is at https://github.com/LuaJIT/LuaJIT/blob/7e05355a08255f508d334eded96095e0bde06e2e/src/jit/dump.lua
14:52:13  <creationix>can't require it as jit.dump though
14:53:41  <creationix>If we can run the agent using luajit instead of luvi we could use the CLI arg, but I wonder how tricky that is
14:54:05  <creationix>not all the agent modules were converted to the new lit style that works with plain lua right?
14:54:56  <rphillips>true
14:55:17  <creationix>so I guess worst case is copy-paste this module into the agent and call it
14:55:31  <creationix>the modules it depends on seem to be available in luvit
14:56:20  <rphillips>i can try that shortly
15:17:25  * rgrinberg quit (Ping timeout: 244 seconds)
15:18:54  <creationix>nevermind, you can require jit.dump in luajit, it should work in luvit
15:19:45  <creationix>rphillips: so just require('jit.dump').start() at the start of the agent
15:20:02  <rphillips>k
15:21:24  <rphillips>hmm. no such package
15:21:42  <creationix>I get it sometimes
15:22:09  <creationix>nevermind, that was luajit, not luvit
15:22:18  <creationix>I wonder why luvit doesn't have it
15:28:21  <cat5e>hi
15:28:33  <cat5e>creationix, do you fiddle with require?
15:28:54  <cat5e>try using native require instead
15:29:12  <creationix>rphillips: ok, so copy luajit/src/jit/dump.lua to luvit/deps/jit-dump.lua and you can require it as `jit-dump`
15:29:23  <creationix>cat5e luvit has its own require
15:29:33  <cat5e>creationix, yeah, use native require
15:29:37  <creationix>but if you build your own luvi or luv based app you don't have to
15:29:47  <cat5e>and don't do that copy thing
15:30:25  <cat5e>https://github.com/LuaJIT/LuaJIT/blob/7e05355a08255f508d334eded96095e0bde06e2e/src/jit/dump.lua#L59-L60
15:30:59  <creationix>cat5e yes, it depends on other modules, that works
15:32:48  <cat5e>add a thing if startswith 'jit.' then use native require
15:32:53  <cat5e>idk
15:33:03  <cat5e>.-.
15:33:10  <cat5e>I mean
15:33:20  <cat5e>wait
15:33:29  <cat5e>creationix, these aren't stored inside the luajit binary IIRC
15:33:48  <cat5e>so you do need to copy them over
15:33:49  <cat5e>nvm :/
15:34:01  <rphillips> No such module 'jit.vmdef' in 'bundle:/deps/jit-dump.lua'
15:34:13  <cat5e>but you need to keep luajit version in mind
15:34:21  <cat5e>so add the copy as part of the build process
15:34:55  <creationix>after adding jit-dump.lua to luvit's deps, this works:
15:34:56  <creationix>luvi . -- -e 'require("jit-dump").start("A") for i = 1, 1000 do b = i end'
15:35:04  <creationix>I get nice colored trace output
15:35:35  <cat5e>creationix, are you using standard luvi or system-luajit luvi?
15:35:53  <creationix>cat5e luvi embeds a static copy of luajit
15:36:06  <cat5e>there's a build flag to use system luajit IIRC
15:36:10  <creationix>from the luv submodule
15:36:37  <creationix>rphillips strange
15:37:25  <creationix>rphillips, so build racker/luvi with regular-asm and run rma with it?
15:39:22  <rphillips>luvit luvit -e "require('./deps/jit-dump')"
15:39:30  <rphillips>errors
15:39:33  <rphillips>brb
15:41:09  <cat5e>so, I tend to be a good debugger
15:41:46  <cat5e>but we're talking about at least 3 completely separate systems (2 local, 1 build system) and a bunch of different build flags
15:42:07  * rgrinberg joined
15:42:27  <cat5e>creationix, can you add the build flags to luvi -v output and mark that for the next luvi release?
15:42:51  <cat5e>like how ffmpeg -v lists build flags (or am I thinking of a different program... idk)
15:43:53  <creationix>rphillips I got it tracing in the agent
15:47:41  <creationix>hmm, all tests pass with tracing on
15:48:57  <creationix>oh, I'm on an old version of virgo-agent-toolkit/luvi that doesn't have the new commits yet
15:49:39  <creationix>updating to latest luv...
15:50:33  <creationix>bingo, reproduced bug
15:53:16  <rphillips>nice
15:53:34  <creationix>now I'm trying to run just that one test to narrow it down
15:55:31  <creationix>nevermind, test-check is the first one run it seems, now to revert the commit in question and get a pair of dumps to compare
16:00:26  <creationix>rphillips reverting the commit in question didn't make the error go away. Now trying a clean build
16:01:22  <creationix>this is how I got the trace working. https://gist.github.com/creationix/e597b67b87bbe1a91a4ef349486c90eb
16:01:53  <creationix>after copying the dump file into the agent's deps folder
16:03:17  <creationix>hmm, still crashing with commit reverted
16:03:30  <creationix>(the bug is still reproducing)
16:05:42  <creationix>I even undid the submodule bump all the way back and it's still reproducing
16:06:15  <cat5e>that doesn't sound right...
16:09:41  <creationix>now trying master in the luvi fork, that was working for me before
16:10:11  <creationix>ok, that passes
16:16:16  <creationix>I think the bug isn't in luajit
16:16:44  <creationix>rphillips, doing a clean build of virgo/luvi passes as expected. But updating only luajit to latest v2.1 still passes
16:17:01  <creationix>so I'm thinking the bug is in luv
16:17:07  <rphillips>hmm
16:17:42  <creationix>we also updated libuv
16:17:58  <creationix>from d989902ac658b4323a4f4020446e6f4dc449e25c to 229b3a4cc150aebd6561e6bd43076eafa7a03756
16:18:30  <creationix>looks like v1.9.0 to v1.9.1
16:19:02  <creationix>trying now with just libuv and luajit updated, but luv left as-is
16:20:24  <creationix>and that breaks it
16:20:53  <creationix>I guess time to bisect libuv
16:21:22  * Harageth joined
16:25:43  <cat5e>creationix, what happens if you just update libuv?
16:26:16  <creationix>cat5e you mean update libuv, but not luajit?
16:26:25  <cat5e>yeah
16:26:32  <creationix>I could try that later
16:32:48  <creationix>hmm, all the commits are failing in this bisect, perhaps it's not libuv after all. The bug is just not reliable
16:34:00  <creationix>bisect blames this one https://github.com/libuv/libuv/commit/4b444d3fbc4d588834b6089d401587dd0a8e85ee
16:34:16  <creationix>that's the first commit after the v1.9.0 tag that I told it was good
16:38:12  * SkyRocknRoll joined
16:40:17  <creationix>ok, updating just libuv, but not luajit doesn't reproduce the error
16:40:24  <cat5e>creationix, welp
16:40:30  <cat5e>I feel sorry for you
16:40:34  <creationix>I'm pretty sure the bug is in luv itself
16:40:55  <cat5e>creationix, how?
16:41:07  <cat5e>does luv fiddle with luajit internals?
16:41:11  <creationix>updating luajit alone didn't do it, updating libuv alone didn't
16:41:21  <creationix>but when I was testing both, luv was also updated somehow
16:41:47  <cat5e>ah
16:42:30  <creationix>now trying with old luv and new libuv and luajit
16:43:39  <creationix>all passed
16:45:44  <creationix>I'm now bisecting luv itself
16:46:12  <creationix>I'm keeping the new versions of luajit and libuv
16:46:39  <creationix>hmm, maybe not, that causes build issues
16:54:03  <creationix>rphillips: found it, sort of https://github.com/luvit/luv/commit/36a80080f46bbf614008edd957890a58aeb97f46
16:54:16  <creationix>I think the issue is much older, but this commit added stronger error checking
16:55:16  <creationix>so basically your timer's handle is NULL and has been for some time, but this commit causes it to fail the type check and crash instead of the previous undefined behavior
16:56:56  <rphillips>interesting
16:59:10  <creationix>so what to do. I don't really want to revert my check and go back to undefined behavior
17:01:39  <creationix>I guess reduce the test case and continue looking for the root cause?
17:04:35  <rphillips>perhaps... i'm looking at the commit
17:04:55  <creationix>I just asked the neovim people since they were seeing a similar issue (the reason I added the guard)
17:05:57  * Harageth quit (Remote host closed the connection)
17:06:33  * Harageth joined
17:08:33  <creationix>hmm, actually the previous behavior isn't that undefined, it should have segfaulted
17:20:23  <creationix>nope, it's undefined, is_closing never dereferences ->data like most do
18:08:41  <creationix>so I'm pretty sure this is what's causing ->data to be NULL https://github.com/luvit/luv/blob/master/src/handle.c#L80
18:08:54  <creationix>so it sounds like we're trying to use an already closed timer
18:09:20  <creationix>I wonder if I can tweak the type check code to not fail a valid handle that's been closed
18:19:05  <creationix>rphillips, so this workaround seems to work for the failing test. I need to go back and remember why I nulled out the data property. I suspect it's because that pointer is no longer valid and we don't want dangling pointers.
18:19:06  <creationix>https://github.com/luvit/luv/pull/243
18:20:01  <rphillips>perhaps it should be *(handle->data) = NULL ?
18:20:58  <creationix>no, the code works, it NULLs out the field as designed
18:21:01  <creationix>the problem is a design issue
18:21:26  <creationix>lua still can have references to libuv types that have been closed.
18:21:33  * travis-ci joined
18:21:34  <travis-ci>luvit/luv#281 (closed-handles-typecheck - 814f91c : Tim Caswell): The build passed.
18:21:34  <travis-ci>Change view : https://github.com/luvit/luv/commit/814f91c9d89d
18:21:34  <travis-ci>Build details : https://travis-ci.org/luvit/luv/builds/148383307
18:21:34  * travis-ci part
18:21:50  <rphillips>ah gotcha
18:22:37  <creationix>and it should probably be NULLed out since it's now a dangling pointer and lua can still reference the uv_handle
18:22:53  <creationix>but so much code assumes if it's a handle, it also has a luv_handle
18:23:09  <creationix>maybe I just need to not clean out the luv_handle till it's GCed by lua
18:23:41  <creationix>handle is a uv_handle_t subclass allocated as userdata on the lua heap
18:23:51  <creationix>handle->data is a luv_handle_t allocated in C using malloc
18:24:01  <creationix>sometimes there is also handle->data->extra that's allocated in C using malloc
19:07:29  <creationix>rphillips. I think this fixes it properly. I'm going to run it through all the tests to make sure.
19:07:30  <creationix>https://github.com/luvit/luv/pull/243/files
19:08:45  <creationix>It does affect memory usage a little. Now we never free luv_handle_t instances until the parent uv_handle_t userdata is garbage collected
19:09:00  <creationix>so programs that leak references to the userdata will never free the luv_handles either
19:09:11  <creationix>shouldn't be a problem for well-behaved programs though
19:10:21  <rphillips>creationix: i don't think you need the if statements
19:10:25  <rphillips>free can handle a NULL
19:10:34  <creationix>true
19:11:35  <creationix>better https://github.com/luvit/luv/pull/243/commits/1238696035654a509d58de2ec6492f4fe05dae06
19:12:41  <creationix>hmm, I don't think I need to free(handle) either
19:13:02  <creationix>if that is the userdata, lua will free it for me right? But if that were true, I would get double-free errors
19:13:10  <rphillips>that is true
19:13:20  <rphillips>i think the gc frees the userdata
19:13:59  * travis-ci joined
19:14:00  <travis-ci>luvit/luv#283 (closed-handles-typecheck - dee2415 : Tim Caswell): The build has errored.
19:14:00  <travis-ci>Change view : https://github.com/luvit/luv/compare/814f91c9d89d...dee2415aa993
19:14:00  <travis-ci>Build details : https://travis-ci.org/luvit/luv/builds/148400241
19:14:00  * travis-ci part
19:16:43  <cat5e>can luvit run the lua test suite?
19:17:08  <creationix>cat5e why would it want to? I assume luajit implements lua correctly
19:17:32  <creationix>rphillips, hmm, the lots-o-timers leak test is failing
19:18:31  * travis-ci joined
19:18:32  <travis-ci>luvit/luv#284 (closed-handles-typecheck - 1238696 : Tim Caswell): The build was broken.
19:18:32  <travis-ci>Change view : https://github.com/luvit/luv/compare/dee2415aa993...123869603565
19:18:32  <travis-ci>Build details : https://travis-ci.org/luvit/luv/builds/148401969
19:18:32  * travis-ci part
19:18:48  <cat5e>creationix, lua test suite does GC etc stress-testing
19:19:02  <cat5e>idk if luajit has a similar test suite
19:19:10  <creationix>right, but for lua itself, not for my luv bindings
19:19:45  <creationix>I'm trusting the GC works, I'm interested in testing to see if my bindings leak
19:20:48  <cat5e>creationix, it can also stress-test require etc
19:22:01  <creationix>rphillips I wonder why lots-o-timers is the only leak test that fails. It just creates and then closes lots of timers
19:22:30  <rphillips>does it crash out, so we can get a backtrace?
19:25:59  <creationix>no, I think it's just going too fast to let the GC catch anything
19:26:11  <creationix>if I add a uv.run() after closing each timer, it doesn't leak
19:26:36  <creationix>since we deferred the cleanup till GC, it must not be happening till a later tick than before
19:27:46  <creationix>or I can just call collectgarbage twice to get finalizers
19:27:53  <creationix>that seems a more portable solution
19:28:23  <creationix>which explains why it passed on lua 5.2 but failed on luajit and 5.3; those have two-phase GCs
19:30:09  <cat5e>creationix, you should call it 4 times to be safe
19:30:43  <cat5e>(LJ 3.x is gonna have a 4-pass GC or something)
19:32:58  <creationix>hmm, still leaking
19:36:42  * Harageth quit (Remote host closed the connection)
19:41:59  * rgrinberg quit (Ping timeout: 250 seconds)
19:51:07  <rphillips>creationix: vidyo, my room?
19:52:55  <creationix>sure
19:52:57  <creationix>rphillips &
19:52:59  <creationix>^
20:00:20  * Harageth joined
20:04:56  * Harageth quit (Ping timeout: 258 seconds)
20:15:35  * travis-ci joined
20:15:36  <travis-ci>luvit/luv#285 (closed-handles-typecheck - c872abf : Tim Caswell): The build is still failing.
20:15:36  <travis-ci>Change view : https://github.com/luvit/luv/compare/123869603565...c872abfb17f3
20:15:36  <travis-ci>Build details : https://travis-ci.org/luvit/luv/builds/148417143
20:15:36  * travis-ci part
20:21:10  * travis-ci joined
20:21:11  <travis-ci>luvit/luv#286 (master - 2de9229 : Tim Caswell): The build failed.
20:21:12  <travis-ci>Change view : https://github.com/luvit/luv/compare/30010c081930...2de9229517ca
20:21:13  <travis-ci>Build details : https://travis-ci.org/luvit/luv/builds/148418280
20:21:14  * travis-ci part
20:22:58  <creationix>found the reason for the close
20:23:48  <creationix>I think it's related to spawn
20:25:22  <creationix>https://github.com/luvit/luv/blob/master/src/process.c#L229-L235
20:46:47  * SkyRocknRoll quit (Remote host closed the connection)
21:01:50  * Harageth joined
21:03:19  <creationix>yay, builds are passing now
21:03:41  <creationix>not sure why we have that edge case for spawn, but I'm not digging into it today. One fix at a time
21:04:36  * travis-ci joined
21:04:37  <travis-ci>luvit/luv#287 (master - 1dfdd82 : Tim Caswell): The build was fixed.
21:04:37  <travis-ci>Change view : https://github.com/luvit/luv/compare/2de9229517ca...1dfdd82706d1
21:04:37  <travis-ci>Build details : https://travis-ci.org/luvit/luv/builds/148427706
21:04:37  * travis-ci part
21:06:23  * Harageth quit (Ping timeout: 250 seconds)
21:28:58  * Harageth joined
21:40:30  * rgrinberg joined
22:11:08  * Harageth quit (Remote host closed the connection)
22:12:36  * Harageth joined
22:20:09  * rendar quit (Ping timeout: 276 seconds)
22:31:03  <Harageth>creationix did I see earlier that you might have pinned down what that bug we saw yesterday was?
22:32:26  <creationix>Harageth yep
22:32:31  <creationix>so we should do a release on Monday
22:33:17  <Harageth>ok cool. rphillips and I got through quite a bit of my documentation for deploying yesterday so hopefully that will go more smoothly!
22:33:38  <Harageth>and hopefully oncall will be as smooth as this week
23:00:41  * rgrinberg quit (Ping timeout: 244 seconds)
23:15:08  * rgrinberg joined