00:00:00  * ircretary quit (Remote host closed the connection)
00:00:08  * ircretary joined
00:07:56  * st_luke quit (Read error: Connection reset by peer)
00:32:13  * loladiro quit (Quit: loladiro)
00:48:24  * amartens joined
01:02:53  * c4milo joined
01:15:13  * c4milo quit (Remote host closed the connection)
01:16:29  * loladiro joined
01:16:59  * dsantiago quit (Quit: Computer has gone to sleep.)
01:18:10  * dsantiago joined
01:24:24  * amartens quit (Quit: Leaving.)
01:37:15  * EhevuTov quit (Quit: This computer has gone to sleep)
01:40:39  * amartens joined
01:43:20  * abraxas joined
01:51:49  * defunctzombie_zz changed nick to defunctzombie
01:59:22  * loladiro quit (Quit: loladiro)
02:11:26  * loladiro joined
02:24:46  * EhevuTov joined
02:38:21  <trevnorris>othiym23: only place where async listeners will not capture is crypto.pbkdf2() and process.exit()
02:38:30  <trevnorris>and coolio. i'll take a look.
03:12:55  * defunctzombie changed nick to defunctzombie_zz
03:17:05  * st_luke joined
03:17:19  * st_luke quit (Remote host closed the connection)
03:19:37  * inolen quit (Quit: Leaving.)
03:20:09  * inolen joined
03:24:17  * TooTallNate quit (Quit: Computer has gone to sleep.)
03:37:45  * defunctzombie_zz changed nick to defunctzombie
03:37:47  <trevnorris>othiym23: well, i'm fairly confident that I won't be able to work domains out of core until v0.13. but as it is, async listener doesn't add any performance hit, so hopefully everyone's cool w/ it going in.
03:40:36  <trevnorris>othiym23: I thought it'd be fairly trivial to go ahead and instantiate the net server beforehand. but since it's "net" and not "tcp", Node has to wait for the connection information.
03:42:26  <trevnorris>othiym23: but since Node allows the user to listen as a pipe or tcp it has to wait for the listening information to know what type of handle to setup.
03:45:06  <trevnorris>tjfontaine: sup dude?
03:50:14  * vptr joined
03:51:49  <tjfontaine>nothing, sup with you
03:52:02  <tjfontaine>I mean still fucking with npm but whatever, took some time off it this weekend
03:52:04  * TooTallNate joined
03:52:44  <groundwater>what's wrong with npm?
03:53:32  <tjfontaine>it's broken in 0.10.19
03:54:46  <groundwater>completely? or just some edge case
03:55:01  <tjfontaine>seems induced by some internet latency
03:55:32  * TooTallNate quit (Client Quit)
03:56:02  <groundwater>is it the 'cb() not called' error?
03:56:15  <groundwater>because i bumped into that several times today
03:56:31  <groundwater>i was in a cafe with shitty wifi
03:56:36  <tjfontaine>yes, that's it
03:56:51  <trevnorris>haven't been able to bisect it down to a few commits?
03:57:08  <groundwater>definitely cleared up when i came home
03:58:17  <tjfontaine>trevnorris: you mean in node?
03:58:48  <trevnorris>tjfontaine: um... is it a node issue or npm issue?
03:59:03  <tjfontaine>I'm about 90% sure it's an npm issue
03:59:11  <trevnorris>fun fun. :P
03:59:24  <tjfontaine>but if you're on node 0.10.18 or earlier `npm install -g npm` and find out :)
03:59:43  <groundwater>if you're trying to reproduce it, i can point you to the cafe where you'll hit it EVERY TIME
03:59:53  <trevnorris>haha.
04:00:00  <tjfontaine>heh I have it here as well, on multiple machines :)
04:01:07  <trevnorris>tjfontaine: well, in light of learning domains will not be easily removed, i've pretty much rewritten my v0.13 performance goals.
04:01:29  <trevnorris>tjfontaine: but i'm pretty sick of it for now. how do I reproduce this issue?
04:01:43  <tjfontaine>npm install some shit :)
04:01:59  <trevnorris>specific version of node?
04:02:44  <tjfontaine>0.10.19
04:02:46  <tjfontaine>or
04:02:49  <tjfontaine>npm install -g npm
04:03:32  * defunctzombie changed nick to defunctzombie_zz
04:05:08  <trevnorris>tjfontaine: cool. i'll take a look at this, and how about you figure out how to rewrite {TCP,Pipe}Wrap so a base class can be instantiated immediately then the uv_{tcp,pipe}_ can be set later. :P
04:05:30  <tjfontaine>heh
04:05:40  <tjfontaine>I really don't want to context switch :)
04:06:12  <trevnorris>does the issue require a latency problem?
04:06:40  <tjfontaine>it happens in many circumstances, sometimes it feels like latency makes it worse
04:06:44  <trevnorris>ok
04:10:44  * defunctzombie_zz changed nick to defunctzombie
04:10:58  <trevnorris>wow, so npm install newrelic gives me a new world of errors, but doubt any of them are what you're talking about.
04:11:07  <trevnorris>tjfontaine: so what error message should I be looking for?
04:11:10  <tjfontaine>hmm maybe I'm wrong
04:11:15  <tjfontaine>npm ERR! cb() never called!
04:11:15  <tjfontaine>npm ERR! not ok code 0
04:12:08  <trevnorris>tjfontaine: no, npm install newrelic installs like half a gig of dependencies.
04:12:16  <tjfontaine>I know what you're saying
04:12:24  <trevnorris>and clang was barfing on a bunch of stuff during build
04:12:45  <tjfontaine>I just put 1.3.8 npm into node-v0.10.19 and it still failed, so I'm going to try reverting some tls patches
04:13:01  * mikeal quit (Quit: Leaving.)
04:13:54  * mikeal joined
04:16:48  <trevnorris>tjfontaine: so, from the error-handler.js seems that another error is occurring, but not being logged.
04:17:04  <trevnorris>so might be a double breakage
04:18:04  <tjfontaine>no, the event loop is exiting cleanly
04:18:11  <tjfontaine>but the last cb doesn't go through the error handler
04:22:40  <trevnorris>tjfontaine: the message appears because itWorked != true, but code === 0. and you don't seem to be getting an E* error message. so the error code being passed isn't recognized by npm
04:22:45  <trevnorris>at least, that's my take.
04:22:57  * defunctzombie changed nick to defunctzombie_zz
04:23:47  <tjfontaine>yes, thus the event loop is exiting cleanly
04:23:54  <tjfontaine>but *not* going through errorHandler
04:24:03  <tjfontaine>which is where itWorked is set to true if there was no error
04:26:25  <trevnorris>tjfontaine: i'm confused by the logic, because errorHandler is only run when there is an error from the npm.load callback.
04:26:46  <trevnorris>so then itWorked could never be set to true
04:26:52  <tjfontaine>no I think its point is really more like exitHandler
04:27:00  <tjfontaine>it's just a generic callback passed to all commands
04:28:58  <trevnorris>tjfontaine: did it dump a useful stack trace?
04:29:32  <tjfontaine>not really no, from there all we know is that we're in a clean exit path
04:30:15  <trevnorris>can you just inject a console.trace() into npm on exit so you always see how it's leaving?
04:30:45  <tjfontaine>what do you mean?
04:30:54  <tjfontaine>the stack trace is coming from the cc side of exit
04:31:22  <tjfontaine>from EmitExit
04:31:50  <trevnorris>yeah, but that message is coming from the process.on('exit' in lib/utils/error-handler
04:31:52  <tjfontaine>it's about as clean of an exit as we can get, there's no more work to be done
04:32:01  <tjfontaine>yes, because the event loop is empty
04:33:12  <trevnorris>ah yeah. so the stack trace will be empty. duh.
04:33:36  <trevnorris>if my logic paths are correct, the only way the callback could have been called w/o an error being passed is from the fs.writeFile in lib/install.js
04:34:03  * vptr quit (Ping timeout: 248 seconds)
04:34:50  <trevnorris>tjfontaine: and he has a dead code path in install() where if arguments.length === 3, then he reassigns cb_, but then it's never used.
04:35:40  <trevnorris>ah wait, nm. man that's confusing.
04:35:58  <trevnorris>ok. so he reassigns it, but it's used in a function declared in the same scope passed as cb
04:40:16  <trevnorris>tjfontaine: what's an example package you've installed and had that failure?
04:40:23  <trevnorris>i'm getting plenty of failures, but none like that one.
04:40:38  <tjfontaine>`npm install jsontool workflow vasync restify`
04:40:42  <tjfontaine>is what's in my test script
04:40:55  <tjfontaine>but also `npm install` from a git clone of sntp
04:41:31  * hueniverse1 joined
04:42:18  * EhevuTov quit (Quit: This computer has gone to sleep)
04:42:52  <trevnorris>tjfontaine: well. that first one has given me 21k lines of npm-debug.log output so far, and it's not done
04:42:57  <trevnorris>no idea what's going on with my system.
04:44:06  * saghul_ joined
04:45:29  <tjfontaine>man if increasing the size of this vm to speed the compiles makes it harder to hit this bug I'll be pissed
04:46:53  <tjfontaine>well good news
04:47:31  * CAPSLOCKBOT quit (Remote host closed the connection)
04:47:59  * CAPSLOCKBOT joined
04:48:04  * LOUDBOT_ joined
04:48:20  <tjfontaine>ya, I'm nearly convinced
04:49:10  <trevnorris>wtf. how are all these modules coming to half a gig. i've always enjoyed node for being so lightweight.
04:49:36  * LOUDBOT quit (*.net *.split)
04:49:37  * saghul quit (*.net *.split)
04:49:38  * hueniverse quit (*.net *.split)
04:49:40  * saghul_ changed nick to saghul
04:49:46  <trevnorris>ah. mother. I have dev=true in my config
04:50:00  <tjfontaine>ok, so it does seem to be the tls stuff, and the fix from fedor already fixes this
04:50:19  <tjfontaine>there must be a setTimeout that's happening here and not being handled in npm
04:51:33  <trevnorris>cool. well, think building all that just fried all my sperm. glad you got it figured out.
04:51:53  <tjfontaine>haha
04:52:00  <tjfontaine>well I'm going to let this test script run over night
04:52:16  <tjfontaine>if it doesn't fail we can all thank fedor
04:52:24  <trevnorris>sounds good to me.
04:52:37  <tjfontaine>now on to the other things I was supposed to do this weekend
04:52:43  <trevnorris>hah, like what?
04:52:59  <tjfontaine>blog posts I've been promising people, or the mdb + manta job
04:53:10  <trevnorris>oh yeah.
04:53:40  <trevnorris>even i'm looking forward to those blog posts.
04:53:43  <trevnorris>dtrace is the shiz
04:53:58  <tjfontaine>hehe, well I'm starting with some mdb stuff first since I have those mostly written
04:54:05  <tjfontaine>actually it's
04:54:23  <tjfontaine>node debugging in parallel, mdb ::findleaks, mdb ::findjsobjects
04:54:57  <trevnorris>ah, cool. yeah, last time I tried to do that I got severely lost
04:55:00  <trevnorris>too many objects
04:55:05  <tjfontaine>nod
04:55:27  <trevnorris>that reminds me, I have some dtrace output to analyze on the smartos server
04:55:47  <trevnorris>think I might actually be close to finding where that stupid pre 3.20 leak is coming from
04:55:57  <tjfontaine>what leak?
04:57:18  * AvianFlu quit (Read error: Connection reset by peer)
04:57:29  * LOUDBOT_ quit (Remote host closed the connection)
04:57:42  * LOUDBOT joined
04:58:01  * AvianFlu joined
04:58:08  <trevnorris>tjfontaine: https://github.com/joyent/node/issues/5828#issuecomment-20846439
04:58:35  <tjfontaine>oh that one
05:00:05  <tjfontaine>I'm hoping to get some cores from walmart, or at least work with them on tracking that down, but it seems to be in the http client
05:00:17  <othiym23>trevnorris: yeah, the newrelic module has a billion development dependencies, sorry about that
05:00:39  <tjfontaine>I mean I know their example is in the server, but I think what wpreul is seeing is in the client
05:00:41  <othiym23>trevnorris: its runtime dependencies are just bunyan (version-locked to the version before it depended on dtrace-node) and CLS
05:00:48  <trevnorris>othiym23: eh, no worries. don't know how my global config got dev=true anyways.
05:01:51  <tjfontaine>anyway while I have motivation, time to write
05:02:10  <othiym23>trevnorris: I'm totally fine with just the asyncListener stuff landing in 0.12, everyone else can fend for themselves O:)
05:02:59  <trevnorris>othiym23: well, if you haven't looked recently i've completely updated https://github.com/joyent/node/issues/6252
05:03:13  <trevnorris>othiym23: there's a ton of stuff I want to get done before v1.0
05:03:37  <trevnorris>but I think the asyncListener stuff will probably have to ship in v0.12 w/ the domain code.
05:03:44  <trevnorris>removing them is going to be a pain like no other.
05:04:14  <trevnorris>there are some more optimizations I can do, but want an initial review first.
05:04:48  <othiym23>trevnorris: once I have New Relic for Node.js 1.0 out the door, I am going to set aside some time to contribute more to Node core
05:04:59  <othiym23>trevnorris: and at that point I should be able to help you out with the domains stuff
05:05:25  <othiym23>that'll be towards the end of October, tho, so probably in a 0.13 timeframe
05:05:52  <trevnorris>othiym23: well, funny part is the domain work actually has nothing to do directly with domains. I need to redo a bunch of the guts to make it so I can instantiate the c++ class at the same time as the js instance.
05:06:30  <trevnorris>when that's done then it'll be a snap to remove domain code.
05:07:52  <othiym23>well, I'm not *completely* dumb when it comes to C++, it's just been a while since I've done a lot of it seriously
05:08:04  <othiym23>and at the very least I can contribute tests and testing
05:08:11  <trevnorris>hah, don't know if you could say I've ever done it seriously. ;P
05:09:05  <trevnorris>othiym23: but thanks much for your help. those two nights you helped me figure out some really stupid bugs on my part.
05:09:14  <trevnorris>but it seems to be working now as intended.
05:09:32  <trevnorris>and meets my minimum at performance, so i'd say it's a win-win so far.
05:09:46  <othiym23>I am very pleased with how this works out
05:10:06  <othiym23>so far, unless people are doing really weird stuff, they should be able to use CLS seamlessly across 0.10 and 0.12
05:10:23  <trevnorris>that's great.
05:10:42  <trevnorris>and as far as 0.12. there'll be nothing you can't see.
05:11:01  <trevnorris>one reason i'm digging it is now I have a single point of reference for dtrace probes.
05:12:09  <othiym23>I want to write some example listeners when I have a little more time
05:12:25  <othiym23>I highly doubt most people are going to understand what the hell this thing is for once 0.12 lands
05:12:25  <trevnorris>i'm interested to see how you plan on using it.
05:13:03  <othiym23>so I'll write a little long-stacktrace module, and an even lighter-weight CLS that all fits on a single screen (which is something creationix already wrote), and a little batch-oriented logging module
05:13:30  <othiym23>maybe even a lower-level transaction tracer than the one I'm using for New Relic that captures everything
05:13:40  <othiym23>those should all be pretty simple to write with what we have now
05:14:57  <trevnorris>coolio. there's one more api that needs to be added. that's the ability to add/remove a listener from an existing handle.
05:16:18  <trevnorris>the other one i've played around with is the ability to add native callbacks, c++ methods, that'll run instead of js functions. it'd be a freaking lot faster for specific cases. though I think that's probably scope creep
05:17:34  <othiym23>trevnorris: being able to add a listener to an existing handle is an interesting idea -- does it have a use case?
05:17:40  <othiym23>or are you just adding that for completeness?
05:17:46  <trevnorris>othiym23: yeah. for domains.
05:18:06  <trevnorris>if i'm going to fully emulate domains then i'll need the ability to add a listener object to an existing handle.
05:18:31  <trevnorris>if not for that then I wouldn't take the time.
05:20:16  <othiym23>when I'm not under the gun, I need to sit down and try to chart out all the different ways Node internally uses EEs, and see if I can break those down into different patterns
05:20:45  <othiym23>and maybe start separating some of the concerns there and thinking reasonably about how to bridge different asynchronous realms in ways that are easier to reason about
05:21:21  <trevnorris>othiym23: on that issue did you read my plan for the future of internal emitters?
05:22:09  <trevnorris>i want to standardize the api we use directly on top of the *Wraps, and use that internally. then event emitters can sit on top of that as well
05:22:26  <trevnorris>so both domains and the event emitter could technically live in userland.
05:24:28  <othiym23>looking at it now
05:25:33  <othiym23>trevnorris: do you have numbers (even rough ones) of the difference in execution time between Function#call and Function#apply?
05:25:45  <othiym23>I use .apply all over the place
05:26:26  <othiym23>I went through the newrelic code base yesterday and replaced all my [].slice.call usage with a handwritten slice() function, good times
05:26:50  <trevnorris>othiym23: it depends. from js to js, they're comparable. but for some reason calling apply from a function invoked from c++, and using "arguments" makes the call take 4-5x's longer.
05:28:12  <trevnorris>othiym23: one thing you need to take into account is the polymorphism the called function is experiencing from the use of call/apply. if the argument order/types are changing often then v8 can't properly IC those methods.
05:28:27  <trevnorris>and any use of call/apply automatically rules out the use of inlining.
05:31:13  * paddybyers joined
05:31:34  <othiym23>yeah, I've paid some attention to monomorphism in my dev efforts, mostly just trying to make sure I don't do any variadic cleverness with my own stuff
05:31:35  <trevnorris>othiym23: but there are some things. like in the case you _have_ to use a function closure. it's better to put a one liner that calls out to another more complex function and that passes the context (e.g. "this") as an argument parameter.
05:31:54  <othiym23>I haven't done any serious testing of that, though -- creating a benchmarking suite like core's is on my list of things to do
05:32:21  <othiym23>just running the integration tests with --trace-deopt turned on doesn't really give me any surprises, although I'm still learning how to interpret what trace-deopt is telling me
05:33:12  <trevnorris>you'll want to run that with --code-comments
05:34:53  <trevnorris>othiym23: I usually run with "--trace-opt-verbose --trace-deopt --trace-inlining --code-comments"
05:35:25  <trevnorris>if you see a wall of constantly re-evaluated methods it means you have a function closure somewhere.
05:35:56  <trevnorris>also there's the basic --prof which you can run against v8's tools/linux-tick-processor that'll give you some good info
05:36:42  <trevnorris>but for a full swath, run --ll_prof (reference the comments in tools/ll_prof.py on how to use it)
05:36:58  <othiym23>the problem with --prof is that I'm testing a lot of code that's not my own
05:37:07  <othiym23>I need a more controlled benchmarking environment
05:37:33  <othiym23>and yeah, I use --trace-opt and --code-comments, and mostly it just tells me that NativeModule.require flaps a lot
05:37:54  <trevnorris>ben wrote a perf module that'll allow you to start it lazily and you can start/stop it when needed.
05:38:02  <othiym23>in fact, the --trace commands aren't super useful to use on non-core code because most of the output you see is coming from inside core because those are the hottest / most heavily hit paths
05:38:26  <othiym23>that would help me more if I didn't develop on OS X ;)
05:38:41  <trevnorris>heh
05:39:07  <othiym23>what I'm going to do when I do that is just clean up my SmartOS vms or take tjfontaine's advice and use manta / some SmartMachines and do all my benchmarking and profiling on a platform that has good support for it
05:39:10  <trevnorris>othiym23: well, you could always try irhydra. just need to build core with: GYP_DEFINES="v8_enable_disassembler=1 v8_object_print=1" ./configure; make -j8
05:39:27  <trevnorris>othiym23: then run your code with --trace-hydrogen --trace-deopt --code-comments --print-opt-code --trace-phase=Z
05:39:40  <trevnorris>pipe stdout to code.asm
05:39:58  <trevnorris>then go to http://web.irhydra.googlecode.com/git/hydra.html and load your compilation artifacts
05:40:11  <trevnorris>(i.e. code.asm and hydrogen*)
05:40:14  <othiym23>yeah, I've only played with IRHydra a little bit (mostly following along with mraleph's blog posts), and it seems neat, but I feel like I have a ways to go before I really need to get that into the nitty-gritty of the New Relic stuff's overhead
05:40:31  <trevnorris>hah, fair enough.
05:40:31  <othiym23>right now I think it's probably < 1% overhead imposed by the agent, and that's acceptable for NR (and most of NR's customers)
05:40:59  * loladiro quit (Quit: loladiro)
05:41:17  <othiym23>and there's obvious stuff to clean up, like minimizing object creation and trying to find more ways to eliminate creating closures
05:43:04  <trevnorris>yeah. right now i've gotten async listeners down to about as little overhead as possible. though, there's still not a fast way to get from c++ to js from the event loop.
05:43:54  <trevnorris>going through Function::Call() takes a minimum of 50ns, whereas js to c++ can be as few as 8ns.
05:44:02  <trevnorris>really annoying.
05:44:26  <trevnorris>othiym23: do you have any native code in your codebase?
05:45:56  <trevnorris>tjfontaine: so, from dtrace I see "onread at net.js position 25676"
05:46:04  <trevnorris>does that mean line 25...k
05:46:15  <othiym23>trevnorris: no (mostly to keep deployment simple)
05:46:30  <trevnorris>othiym23: ok. so i'll shut up about all that stuff :)
05:46:57  <othiym23>trevnorris: although I have a longish list of metrics that it would be good to gather that can only be reached from C++-land
05:47:12  <trevnorris>othiym23: like what?
05:47:43  <othiym23>trevnorris: once Azure has figured out how to host precompiled binaries for npm deployment, I'll probably take another look at it
05:48:55  <othiym23>trevnorris: real cputime, better memory usage metrics, GC tracing (without forcing people to start with --expose-gc, because nobody reads the docs), V8 heap vs Node heap
05:49:20  <othiym23>maybe some libuv counters if it's possible to implement those without slaughtering performance
05:49:44  <trevnorris>i think we already have a bunch of probes in core for stuff like that.
05:49:48  <othiym23>you know, the kinds of things it's trivial to grab with DTrace but that it's nearly impossible to get with pure JS
05:50:06  <othiym23>yeah, the question is how to get at it in-process
05:50:21  <othiym23>NR can't dictate peoples' runtime environment
05:50:33  <othiym23>a distressingly high number of Node users deploy to Heroku because of their free tier
05:51:14  <othiym23>and I know the person maintaining the Node buildpack over at Heroku, and Heroku and New Relic are partners, but getting perf or systemtap turned on for individual dynos sounds like something approaching real work
05:53:20  * EhevuTov joined
05:57:56  <trevnorris>tjfontaine: quick context switch: here's some dtrace output from mmap after hammering the server and forcing rss to grow beyond what it should've been able to: https://gist.github.com/trevnorris/6759810
05:58:04  <trevnorris>tjfontaine: any tips on where to look next?
06:05:47  * trevnorris&
06:05:48  <LOUDBOT>HOLY SHIT IT'S CARNIES AND THEY FEAST ON YOUR FLESH
06:10:07  * rendar joined
06:10:20  * icarot joined
06:13:08  * icarot2 joined
06:14:48  * icarot quit (Ping timeout: 256 seconds)
06:32:08  * icarot joined
06:36:01  * icarot2 quit (Ping timeout: 245 seconds)
06:41:20  <MI6>nodejs-v0.10-windows: #235 UNSTABLE windows-ia32 (7/600) windows-x64 (8/600) http://jenkins.nodejs.org/job/nodejs-v0.10-windows/235/
06:55:41  * icarot part ("Leaving")
07:00:54  * rendar quit
07:02:51  * rendar joined
07:44:36  * dominictarr quit (Ping timeout: 245 seconds)
07:49:19  * dominictarr joined
07:50:29  <`3rdEden>trevnorris: I was looking through your v0.13 optimisation check list, you might want to add removing .bind calls from the core as well
07:52:07  <`3rdEden>There are a couple of .bind calls in the fs module, console and __debugger. Changing them to variable references or just wrapping them with a plain function that does a .call would be much more efficient
08:07:34  * dominictarr quit (Quit: dominictarr)
08:19:57  * wolfeidau_ joined
08:20:32  * AvianFlu quit (Remote host closed the connection)
08:21:02  * hz joined
08:22:00  * AvianFlu joined
08:22:09  * AvianFlu quit (Remote host closed the connection)
08:25:00  * dominictarr joined
08:29:03  * dominictarr quit (Client Quit)
08:29:30  * dominictarr joined
08:33:50  * bnoordhuis joined
08:33:59  * dominictarr quit (Ping timeout: 260 seconds)
08:43:31  * hz quit
08:45:39  * amartens quit (Quit: Leaving.)
08:49:18  * bnoordhuis quit (Ping timeout: 245 seconds)
08:55:17  * bnoordhuis joined
09:04:55  * hz joined
09:16:00  * amartens joined
09:22:31  * amartens quit (Ping timeout: 245 seconds)
09:39:40  * wolfeidau_ quit (Quit: Leaving)
09:44:28  * bnoordhuis quit (Ping timeout: 256 seconds)
09:50:05  * bnoordhuis joined
10:20:38  * amartens joined
10:25:03  * amartens quit (Ping timeout: 260 seconds)
10:33:17  * abraxas quit (Remote host closed the connection)
10:33:50  * abraxas joined
10:35:49  * bnoordhuis quit (Remote host closed the connection)
10:38:05  * abraxas quit (Ping timeout: 240 seconds)
10:44:25  * Kakera joined
10:46:37  <MI6>nodejs-v0.10: #1509 UNSTABLE smartos-x64 (2/600) http://jenkins.nodejs.org/job/nodejs-v0.10/1509/
11:10:20  * loladiro joined
11:12:58  * loladiro quit (Client Quit)
11:20:56  * amartens joined
11:25:21  * amartens quit (Ping timeout: 252 seconds)
11:41:35  * vptr joined
11:41:55  * vptr quit (Changing host)
11:41:55  * vptr joined
12:21:16  * amartens joined
12:25:28  * amartens quit (Ping timeout: 240 seconds)
12:35:29  * M28 quit (Read error: Connection reset by peer)
12:41:02  * vptr quit (Quit: WeeChat 0.3.5)
13:11:52  * mikeal quit (Quit: Leaving.)
13:14:04  * EhevuTov quit (Quit: This computer has gone to sleep)
13:17:42  * mikeal joined
13:17:57  * mikeal quit (Client Quit)
13:21:35  * amartens joined
13:22:54  * mikeal joined
13:25:51  * amartens quit (Ping timeout: 245 seconds)
13:31:00  * dominictarr joined
13:38:55  * loladiro joined
14:00:29  * loladiro quit (Quit: loladiro)
14:06:27  * mikeal quit (Quit: Leaving.)
14:11:54  * dominictarr quit (Quit: dominictarr)
14:14:15  * mikeal joined
14:21:57  * amartens joined
14:24:37  * mikeal quit (Quit: Leaving.)
14:26:23  * amartens quit (Ping timeout: 245 seconds)
14:27:51  * mikeal joined
15:03:58  * vptr joined
15:16:31  * AvianFlu joined
15:17:42  <MI6>nodejs-master: #586 UNSTABLE smartos-x64 (6/643) http://jenkins.nodejs.org/job/nodejs-master/586/
15:20:53  * loladiro joined
15:22:17  * amartens joined
15:26:25  * amartens quit (Ping timeout: 246 seconds)
15:28:46  * inolen quit (Quit: Leaving.)
15:41:59  * c4milo joined
15:50:20  * mikeal quit (Quit: Leaving.)
15:51:15  * amartens joined
16:07:21  * defunctzombie_zz changed nick to defunctzombie
16:08:56  * defunctzombie changed nick to defunctzombie_zz
16:20:49  * inolen joined
16:21:04  * mikeal joined
16:21:30  * AvianFlu_ joined
16:24:43  * AvianFlu quit (Ping timeout: 245 seconds)
16:26:35  * AvianFlu_ changed nick to AvianFlu
16:27:53  * octetcloud joined
16:28:20  * groundwater quit (Quit: groundwater)
16:29:15  * vptr quit (Ping timeout: 240 seconds)
16:30:14  * mikeal quit (Ping timeout: 240 seconds)
16:35:08  * kellabyte quit (Quit: Quit)
16:39:00  <tjfontaine>trevnorris: I'll review your comments and see :)
16:49:50  * hueniverse1 quit (Read error: Connection reset by peer)
16:52:49  * hueniverse joined
16:54:11  * octetcloud quit (Quit: WeeChat 0.4.1)
16:54:40  * amartens quit (Quit: Leaving.)
16:55:00  <tjfontaine>hueniverse: do you happen to know which test in sntp in particular was causing that abort, if not it's fine I will track it down, we're going to be doing a release of 0.10 today and I'd like to include that fix
16:55:26  * vptr joined
16:55:29  * kellabyte joined
16:55:45  * octetcloud joined
16:55:58  <hueniverse>tjfontaine: I think it's a combination of tests. If you only run one test at a time it doesn't happen (at least last time I did that)
16:56:27  <tjfontaine>ok
16:56:40  <hueniverse>tjfontaine: also, I found the problem that was causing the test tool to fail - I removed the listeners after socket.close() but still got an error after that
16:57:15  <hueniverse>tjfontaine: which came in after the wrapping domain already exited
16:57:43  <hueniverse>tjfontaine: that is actually a much more annoying problem
16:58:01  <tjfontaine>hrm, can you file an issue on that?
16:58:14  <hueniverse>tjfontaine: and the only way I know how to solve it is to add an empty static listener on error on sockets I just don't care about anymore
16:58:22  <tjfontaine>right
16:59:06  <hueniverse>I'm not sure this is an issue. is there any guarantee that once 'close' is called, there will not be an error event?
16:59:31  <hueniverse>I even tried waiting for the 'close' event to remove the listeners inside the cb, but that didn't help either
16:59:46  <tjfontaine>at the moment I'm not sure, but to me close should signify that no other events should happen
16:59:52  * octetcloud quit (Client Quit)
16:59:59  <hueniverse>yeah...
17:00:11  <tjfontaine>{error,data,timeout}->end->close
17:00:15  <tjfontaine>to me
17:00:33  <tjfontaine>the sntp failure is inside the #offset set of tests
17:01:14  <hueniverse>at one point I found the first test that was failing, then isolated it
17:01:18  <hueniverse>then it didn't fail
17:01:23  <hueniverse>but when I c/p it three times
17:01:28  <hueniverse>it fails on the second and third
17:01:37  <tjfontaine>ok
17:02:03  <hueniverse>note that you need to get sntp 0.x and lab 0.1.6 for this to happen
17:02:07  <hueniverse>master is now fixed
17:02:12  <hueniverse>"fixed"
17:02:28  <tjfontaine>right I still have my checkout from the weekend
17:02:39  <hueniverse>lmk if I can help
17:02:46  <tjfontaine>have you run the broken commit against master?
17:03:21  <hueniverse>against node master?
17:03:24  <tjfontaine>yes
17:03:28  <hueniverse>nope
17:03:30  <tjfontaine>k
17:04:29  <hueniverse>are you trying to figure out why it aborted or why the error is happening in the first place?
17:04:46  <hueniverse>I think it is aborting because no one is listening on error
17:05:33  <tjfontaine>right, but because there was an error it's trying to introspect into the handle, even though we're not ready for it
17:06:04  <tjfontaine>util.inspect() of an error'd dgram socket shouldn't result in an abort :)
17:06:28  <hueniverse>k
17:06:44  <tjfontaine>hueniverse: what error was it throwing in particular, do you recall?
17:07:05  <hueniverse>hmm
17:07:09  <hueniverse>I need to play with it again
17:07:16  <tjfontaine>sok, I can probably convince it
17:07:18  <hueniverse>I got so many odd results playing with it
17:08:05  <tjfontaine>weird, it's not erroring for me now in #offset
17:08:10  <tjfontaine>or aborting rather
17:08:18  <tjfontaine>curious
17:08:48  * loladiro quit (Quit: loladiro)
17:10:05  * brson joined
17:13:00  * groundwater joined
17:13:05  * octetcloud joined
17:13:06  * Damn3d quit (Ping timeout: 245 seconds)
17:14:09  * EhevuTov joined
17:17:40  * Damn3d joined
17:20:01  <hueniverse>same version from sat?
17:20:34  <tjfontaine>yes, so it's erroring in getaddrinfo
17:21:02  <tjfontaine>I'll see if I can make a test case that matches this
17:21:51  <hueniverse>do you need help reproducing the abort?
17:22:21  <tjfontaine>only for my test case
17:23:00  <tjfontaine>who is responsible for doing a bind for sntp?
17:23:40  <tjfontaine>oh this is only outbound I guess
17:25:21  <tjfontaine>hmm, should this even be emit'ing on the socket? it should only be delivered to the calling function I think
17:26:30  * AvianFlu quit (Remote host closed the connection)
17:26:33  <hueniverse>not sure what you mean
17:27:00  * loladiro joined
17:27:28  * mikeal joined
17:27:32  <tjfontaine>https://github.com/joyent/node/blob/v0.10/lib/dgram.js#L293-L311
17:28:01  <tjfontaine>if we error on looking up who you're trying to send to, we emit on the socket, vs passing the error to the send callback
17:28:46  * bnoordhuis joined
17:29:28  * AvianFlu joined
17:29:47  * amartens joined
17:31:41  * mikeal quit (Ping timeout: 245 seconds)
17:31:46  * EhevuTov quit (Quit: This computer has gone to sleep)
17:42:47  <tjfontaine>hueniverse: why is this thing a heisenbug, argh.
17:44:02  <othiym23>tjfontaine: is it racy?
17:44:06  <tjfontaine>it must be
17:44:08  <othiym23>sounds like it's racy
17:44:17  <trevnorris>morning
17:44:23  <othiym23>mornin trevnorris
17:44:37  <tjfontaine>a race between the ._handle being instantiated and the error being fired, so just a tick or two off
17:45:12  <trevnorris>bnoordhuis: there's a problem instantiating the *Wrap class at JS instantiation time. the net module lets you create either a pipe or a tcp server at .listen() time, so right now it's impossible to do.
17:45:44  <trevnorris>bnoordhuis: i'd like to figure out a way for this to be possible, because the implications would be beneficial performance-wise, but i'm not sure how yet.
17:47:07  * bnoordhuisquit (Ping timeout: 246 seconds)
17:47:22  <trevnorris>`3rdEden: thanks for the heads up. i'll add that to the list.
17:47:54  <trevnorris>othiym23: anything new going on?
17:48:09  <othiym23>trevnorris: about to add a .bindEmitter method to CLS
17:48:37  <othiym23>trevnorris: your interface for dealing with Emitters is meant for the C++ side of things, right?
17:49:22  <trevnorris>othiym23: the async listener api is completely separated from the EventEmitter. since EE has actually nothing directly to do with the async event.
17:49:29  <othiym23>also I am enjoying some super tasty stoner metal, but that's neither here nor there
17:49:34  <tjfontaine>"We should not expose a raw pointer to typed array's backing store." sad face
17:49:37  <trevnorris>hah
17:49:52  <trevnorris>tjfontaine: what's that from?
17:49:57  <tjfontaine>trevnorris: v8 commits
17:50:07  <trevnorris>ah, bugger. though I thought there's an api for that.
17:50:45  <tjfontaine>trevnorris: https://codereview.chromium.org/25221002 it's only on the ArrayBufferView
17:50:55  <othiym23>trevnorris: I was talking about #6252 -- I know they're separate concepts, but I want to mash them up ;)
17:51:38  <trevnorris>tjfontaine: ah, of the view itself, not the backing data.
17:51:53  <tjfontaine>ya, but still a nice helper nonetheless
17:52:00  <tjfontaine>saves a lookup of the base
17:52:10  <trevnorris>oh, I agree. just wonder why they felt the need to remove it.
17:53:05  <tjfontaine>was anyone else here able to reproduce the npm cb error themselves?
17:53:12  * EhevuTovjoined
17:53:19  <trevnorris>sorry, nope.
17:53:24  <tjfontaine>I would really like another sanity check before I cut this :/
17:53:40  <othiym23>tjfontaine: what's the issue?
17:54:04  <tjfontaine>othiym23: `npm install|update` resulting in the `npm ERR cb() never called!`
17:54:06  <othiym23>I saw your comment on the bug, but you were all "the solution to this problem is too long to write in the margins of this bus ride"
17:54:21  <tjfontaine>oh ok
17:54:21  <tjfontaine>so
17:54:22  <othiym23>sorry, shoulda clarified: what's the issue in Node that's causing the bug?
17:54:37  <tjfontaine> joyent/node#6270
17:54:38  <tjfontaine>that one
17:54:46  * TooTallNatejoined
17:54:59  <tjfontaine>a tls race, that some people more than others were able to hit
17:55:15  <MI6>libuv-master: #263 UNSTABLE windows (3/195) smartos (2/194) http://jenkins.nodejs.org/job/libuv-master/263/
17:55:25  <tjfontaine>network latency, sets of packets, cpu availability of crypto routines
17:55:30  <tjfontaine>who knows what caused the race for them
17:56:03  <othiym23>is there a way to slow down / speed up the passage of time in compiled applications to make races more likely to show up?
17:56:29  <tjfontaine>if I can figure out which timer in npm is firing to cause this, then it should be easy for npm to also fix this problem going forward
17:56:44  * EhevuTovquit (Client Quit)
17:56:46  <tjfontaine>I was assuming it was a net/http.setTimeout
17:57:10  <tjfontaine>but NODE_DEBUG="net http https" wasn't helpful there
17:58:12  <tjfontaine>and hilariously NODE_DEBUG="timers" moves the goal post for when it fails
17:59:41  <trevnorris>tjfontaine: not that it matters, but I found this interesting: https://gist.github.com/trevnorris/6767620
18:00:13  <tjfontaine>how did you manage that
18:00:22  <trevnorris>npm install sntp
18:00:24  <tjfontaine>oh valgrind
18:00:29  <tjfontaine>totally valgrinds fault there
18:00:46  <tjfontaine>that fucker just gets in the way of everything
18:00:48  <trevnorris>oh really? was just looking for a way to slow down the application, and valgrind seemed like a great way.
18:00:51  <trevnorris>hahaha
18:01:02  <tjfontaine>yes well you managed to slow it down, while introducing more bugs :)
18:01:33  <trevnorris>i do find the output for ZCtx::Process() interesting.
18:02:01  <trevnorris>having that issue on master currently, where npm install just dies during the decompress phase
18:02:11  <tjfontaine>ya that could be a legit bug
18:02:25  <tjfontaine>just don't really want to have to use valgrind to evoke it :P
18:02:43  <trevnorris>nod
18:08:19  <trevnorris>tjfontaine: fwiw, the zlib issue on master is pretty bad. I can hardly ever run npm install on master w/o zlib dying in one of several ways. though it's probably not related to what you're trying to fix.
18:08:49  <MI6>libuv-node-integration: #248 UNSTABLE smartos-x64 (6/643) http://jenkins.nodejs.org/job/libuv-node-integration/248/
18:09:05  <tjfontaine>trevnorris: ok I will play before releasing another
18:09:54  <trevnorris>tjfontaine: the errors I get are either ZCtx aborts, or I get a sha sum failure. (of course, I have to clear out my ~/.npm/ dir before every run.)
18:13:51  <trevnorris>othiym23: so I wrote a quick script to keep infinitely long stack traces for every async event. yeah... didn't run very fast :P but sure you're already expecting that.
18:20:31  <tjfontaine>you're fucking winning this game npm, I hope you're fucking proud of it
18:30:12  <othiym23>trevnorris: yeah, long stacktraces are just slow
18:30:28  <othiym23>trevnorris: but there are times when they're pretty much the fastest way to debug an error
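The "infinitely long stack traces for every async event" experiment trevnorris describes boils down to capturing a stack when work is scheduled and splicing it in when the callback fails — the per-event capture is exactly what makes it slow. A minimal sketch, with a hypothetical helper name:

```javascript
// Sketch of long-stack-trace capture: record a stack at scheduling
// time and attach it to any error thrown in the wrapped callback.
// Error.captureStackTrace is V8-specific; scheduleWithStack is a
// hypothetical helper, not a Node API.
function scheduleWithStack(fn) {
  var scheduled = {};
  Error.captureStackTrace(scheduled, scheduleWithStack);
  return function() {
    try {
      return fn.apply(this, arguments);
    } catch (err) {
      // splice the scheduling-time stack onto the failure
      err.stack += '\n---- scheduled at ----\n' + scheduled.stack;
      throw err;
    }
  };
}
```

Wrapping every callback this way makes failures traceable across async boundaries, at the cost of one stack capture per event — hence "didn't run very fast."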
18:34:11  <hueniverse>tjfontaine: been getting the npm cb() errors all weekend
18:34:18  <hueniverse>had to downgrade to 10.18
18:35:31  <tjfontaine>hueniverse: yes, would you like to try a build with the fix?
18:36:05  <hueniverse>tjfontaine: are you asking if I want to or willing to? :-)
18:36:20  <tjfontaine>are you willing and able :)
18:37:29  <hueniverse>just master?
18:37:35  <tjfontaine>v0.10 HEAD
18:37:45  <hueniverse>I guess
18:38:02  * AvianFluquit (Remote host closed the connection)
18:38:27  <hueniverse>I feel like the prettiest girl at the prom this am... but all the boys have bad intentions
18:38:39  <othiym23>is that a good feeling or a bad feeling?
18:38:40  * kellabytequit (Changing host)
18:38:40  * kellabytejoined
18:38:40  * kellabytequit (Changing host)
18:38:40  * kellabytejoined
18:38:41  <hueniverse>good thing I slept in late
18:38:53  <othiym23>a real question here about the kind of prom date you are
18:38:58  <hueniverse>othiym23: starts good but ends bad :-)
18:39:11  <tjfontaine>should I have spiked the punch or not?
18:39:13  <hueniverse>othiym23: never been to a prom
18:39:41  <tjfontaine>always a brides maid, never the bride?
18:39:57  <hueniverse>no proms in Israel
18:40:00  * mikealjoined
18:41:28  <othiym23>hueniverse: not really missing much, if my high school years were any guide
18:42:09  <tjfontaine>othiym23: what, it's not like a john hughes film?
18:42:10  <trevnorris>hueniverse: have a specific thing you're getting the error on? been trying for hours and can't reproduce the error
18:44:03  <tjfontaine>you're definitely on node v0.10.19 proper, right trevor?
18:44:29  <trevnorris>yeah
18:44:41  <tjfontaine>on your local box? have you tried a vm?
18:44:49  <hueniverse>trevnorris: you mean the npm cb() error?
18:45:08  <trevnorris>hueniverse: yeah, can't reproduce the error on 0.10.19
18:45:25  <hueniverse>on one vm I get it whenever I clone hapi and try to do npm install
18:45:27  * AvianFlujoined
18:45:27  <trevnorris>i'm getting a bunch of other errors on master, but nothing on 0.10.19
18:45:34  <hueniverse>on another I got it when I tried to publish hawk
18:45:52  <hueniverse>going to see if I can reproduce the hapi one
18:46:35  <tjfontaine>trevnorris: [email protected] has 0.10.19 installed, you should be able to reproduce it easily there
18:46:43  <trevnorris>tjfontaine: thanks
18:46:49  <tjfontaine>but that doesn't help me verify that I'm not crazy :)
18:49:09  * groundwaterquit (Ping timeout: 248 seconds)
18:50:41  <trevnorris>tjfontaine: you currently on that server?
18:50:58  <tjfontaine>yes
18:51:27  <trevnorris>finally!!!!! great. now at least I've seen the error :P
18:51:30  <tjfontaine>I'm not doing anything atm
18:51:45  * groundwaterjoined
18:53:37  <trevnorris>ok, well i never realized that error was literally the only output.
18:53:43  <tjfontaine>:)
18:53:50  <trevnorris>i thought it happened after a module was retrieved or something
18:54:15  <tjfontaine>and for people with no familiarity with the npm spider base it's a bitch
18:55:26  <trevnorris>that stupid .lock file.
18:55:28  <trevnorris>hate that thing
18:56:03  <tjfontaine>so what you'll experience is you'll see npm appear to hang for a bit on a module, and then boom it dies
18:56:35  <tjfontaine>and then you see it's https, and you're like, yup ... that's what fedor fixed
18:56:55  <tjfontaine>but, I need to convince myself further and that I'm not just kicking the issue down the curb for a bit
18:57:18  <tjfontaine>if I can
18:59:01  <trevnorris>tjfontaine: have you run it with --trace_all?
18:59:15  <tjfontaine>no, but have fun sifting through that
18:59:21  <trevnorris>:)
18:59:27  <tjfontaine>I have been using various NODE_DEBUG's and then adding more
18:59:39  * wavdedjoined
18:59:43  <tjfontaine>up next is to debug around the tls bug that fedor fixed, to prove it to me that way
18:59:54  <tjfontaine>but I was hoping to see a timer fire
19:01:43  <hueniverse>I can't reproduce the npm error on head or 10.19
19:01:55  <hueniverse>and don't have time to keep playing with it - sorry
19:01:57  <tjfontaine>hueniverse: right, frustrating
19:02:03  <tjfontaine>hueniverse: gbtw :P
19:02:21  <tjfontaine>hueniverse: appreciate you spending any time on it
19:02:29  <hueniverse>np
19:03:32  <trevnorris>tjfontaine: so, what's the "fix"?
19:03:49  <tjfontaine>trevnorris: https://github.com/joyent/node/commit/671b5be6e9d74fec9e94b1ab88d2b2648c540078
19:03:59  <MI6>nodejs-master-windows: #379 UNSTABLE windows-x64 (23/643) windows-ia32 (20/643) http://jenkins.nodejs.org/job/nodejs-master-windows/379/
19:04:10  <tjfontaine>brb lunch
19:05:54  * groundwaterquit (Quit: groundwater)
19:14:04  * defunctzombie_zzchanged nick to defunctzombie
19:14:21  * bnoordhuisjoined
19:16:10  <bnoordhuis>hello people
19:16:53  <trevnorris>bnoordhuis: hello dude
19:17:09  <trevnorris>bnoordhuis: catch my last msg just before you popped off?
19:21:13  <bnoordhuis>trevnorris: no. what was it?
19:22:14  <trevnorris>bnoordhuis: for domains to be removed from the c++ code we'll need the c++ class instantiated at the same time as their event emitter js counterparts. you said that was fine, but
19:22:39  <trevnorris>bnoordhuis: since net.createServer allows a tcp or pipe, I can't tell which to call when the server is created in js
19:22:49  <trevnorris>hence why it needs to wait until .listen.
19:23:22  <trevnorris>bnoordhuis: so basically, i'm at a bit of an architecture hard wall and was wondering if you had any feedback to getting around this.
19:24:15  <bnoordhuis>trevnorris: not really, sorry :-/
19:24:57  <trevnorris>only thing I've been able to come up with is create a TCPPipeWrap and just don't call uv_{tcp,pipe}_init until .listen is called.
19:25:06  <trevnorris>but at least the c++ class will have been instantiated.
19:25:13  <trevnorris>feels icky though
19:25:18  <bnoordhuis>indeed
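The "icky" workaround trevnorris describes — construct the wrap object up front but defer the real `uv_{tcp,pipe}_init` until `.listen()` reveals the handle type — is a lazy-initialization pattern. A rough JS sketch of the shape of it, with hypothetical names:

```javascript
// Sketch of deferred handle init: the wrapper exists from construction
// (so listeners and metadata can attach immediately), but the handle
// type is only chosen when listen() shows whether the target is a TCP
// port or a pipe path. LazyServer and its fields are illustrative.
function LazyServer() {
  this._handle = null;   // no underlying handle yet -- mirrors delaying uv_*_init
}

LazyServer.prototype.listen = function(portOrPath) {
  if (typeof portOrPath === 'number') {
    this._handle = { type: 'tcp', port: portOrPath };
  } else {
    this._handle = { type: 'pipe', path: portOrPath };
  }
  return this;
};
```

The upside is that anything keyed off the wrap object (async listeners, domains) can be attached at construction; the downside, as noted, is a half-initialized object living between `createServer()` and `listen()`.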
19:26:09  * EhevuTovjoined
19:26:10  * bradleymeckjoined
19:27:14  <trevnorris>bnoordhuis: thanks for all your help thus far. i'm interested to see what you say about https://github.com/joyent/node/pull/6011
19:27:14  <trevnorris>as it's all part of my grand scheme in https://github.com/joyent/node/issues/6252
19:29:02  * EhevuTovquit (Client Quit)
19:30:48  <bnoordhuis>let me take a peek
19:31:06  <bnoordhuis>1817 additions, 514 deletions!
19:31:11  <trevnorris>:)
19:31:18  <trevnorris>it's taken me 2 months to do that
19:36:59  * TooTallNatequit (Quit: Computer has gone to sleep.)
19:37:37  * wavdedquit (Quit: Hasta la pasta)
19:52:09  * hzquit (Read error: Connection reset by peer)
19:52:37  * hzjoined
19:52:55  * icarot2joined
19:54:00  <trevnorris>othiym23: I guess there is one more option. inject async code into places like .listen() that check if the ee.domain exists, and if it does then load them into the wrap.
19:54:24  <trevnorris>othiym23: it's the least icky of the solutions I've come up with.
19:56:00  * defunctzombiechanged nick to defunctzombie_zz
19:59:18  * julianduquejoined
20:00:18  <tjfontaine>bnoordhuis: no reason to do another libuv release, right since there's only the revert
20:02:01  <bnoordhuis>tjfontaine: correct
20:02:18  <tjfontaine>bnoordhuis: were you ever hitting the npm cb error?
20:02:47  <bnoordhuis>tjfontaine: no. but then i don't install that many modules
20:02:50  <bnoordhuis>only hoarders
20:02:57  <tjfontaine>heh
20:04:09  * dominictarrjoined
20:04:14  <wolfeidau>tjfontaine: Did you need someone that fails with the "npm ERR! cb() never called!" ?
20:04:26  <bnoordhuis>trevnorris: did a partial review, not too many comments so far
20:04:40  <trevnorris>bnoordhuis: that's good to know. :)
20:08:17  * icarot2part ("Leaving")
20:09:23  <tjfontaine>wolfeidau: if you are still experiencing it yes, please test v0.10 HEAD and verify that it's gone there
20:09:56  <wolfeidau>tjfontaine: yeah pretty much every time i run npm cache clean and npm install on a code base i get the error
20:12:18  <tjfontaine>wolfeidau: ok if you try v0.10 HEAD, please let me know how that works out :)
20:12:37  <wolfeidau>yeah np
20:12:59  * defunctzombie_zzchanged nick to defunctzombie
20:13:19  <trevnorris>bnoordhuis: so you'd be cool if I did something similar with "struct pbkdf2_req" in node_crypto.cc like RandomBytesRequest?
20:20:03  * vptrquit (Ping timeout: 260 seconds)
20:28:50  <bnoordhuis>trevnorris: yep
20:29:05  <trevnorris>bnoordhuis: coolio. thanks :)
20:33:57  * CoverSlidejoined
20:34:53  * CoverSlidequit (Client Quit)
20:35:01  * CoverSlidejoined
20:36:52  * defunctzombiechanged nick to defunctzombie_zz
20:45:55  * vptrjoined
20:46:06  <wolfeidau>tjfontaine: I have verified that v0.10.20-pre fixes the problem for me, I also isolated a package.json that reproduces the issue every time on my machine https://gist.github.com/wolfeidau/6769988
20:46:29  <wolfeidau>simply running npm cache clean && rm -r node_modules && npm install
20:50:40  <trevnorris>bnoordhuis: um. i'm pretty sure there's a memory leak in crypto.randomBytes. I just realized that data_ isn't being free'd in the destructor. it's happening in v0.10 as well.
20:51:42  <tjfontaine>wolfeidau: thanks greatly
20:51:48  <trevnorris>anyone else mind taking a quick look at this? https://gist.github.com/trevnorris/6770078
20:52:19  <wolfeidau>tjfontaine: np, I think it is best if you also have a failing test case :)
20:52:55  <MI6>joyent/node: tjfontaine created branch v0.10.20-release - http://git.io/qiLT3A
20:53:09  <tjfontaine>wolfeidau: heh, I have a test I was using as well
20:53:18  <trevnorris>tjfontaine: wait. might have a memory leak fix that needs to go in there. mind taking a quick look at the gist?
20:53:35  <tjfontaine>trevnorris: ok, moment
20:53:35  <tjfontaine>RET=0; while [ $RET -eq 0 ]; do
20:53:36  <tjfontaine> rm -rf ~/.npm node_modules
20:53:36  <tjfontaine> npm install --nodedir=$HOME/node jsontool workflow vasync restify
20:53:36  <tjfontaine> RET=$?
20:53:38  <tjfontaine>done
20:53:57  <wolfeidau>tjfontaine: haha nice
20:58:54  <tjfontaine>trevnorris: lots of slow buffers being created, in a tight loop, what are we expecting to see here?
21:01:21  <trevnorris>tjfontaine: um... wait. one sec. maybe i'm just nuts.
21:01:54  * bnoordhuisquit (Ping timeout: 264 seconds)
21:02:43  <trevnorris>tjfontaine: ok, refresh https://gist.github.com/trevnorris/6770078
21:03:19  <trevnorris>tjfontaine: you'll see max was 2.5 GB. v8 gc should be cleaning up those slow buffers regardless
21:03:24  <trevnorris>and it shouldn't be allowed to grow that much
21:04:03  <trevnorris>tjfontaine: just tried the same in v0.10. same memory result
21:07:41  <trevnorris>tjfontaine: just re-ran the test explicitely calling gc(), no change in memory usage.
21:07:47  * austojoined
21:09:50  <tjfontaine>trevnorris: ok, well
21:10:29  <trevnorris>and i mean, I called gc() on every 1000 loop, not just once.
21:11:02  <tjfontaine>ya, looking
21:14:14  <MI6>node-review: #89 UNSTABLE windows-ia32 (9/600) centos-x64 (1/600) centos-ia32 (2/600) windows-x64 (8/600) osx-x64 (1/600) smartos-x64 (2/600) http://jenkins.nodejs.org/job/node-review/89/
21:16:26  * groundwaterjoined
21:16:33  <trevnorris>tjfontaine: and fwiw I used my buffer-dispose module to manually dispose all the returned slow buffers. still no change in memory usage.
21:17:02  <tjfontaine>ok, well
21:17:58  <tjfontaine>the heap is growing
21:18:10  <tjfontaine>so it's probably a memory leak in node_crypto certainly
21:18:27  <tjfontaine>it's not an anon section, like mmap would be doing for the v8 heap
21:18:46  <tjfontaine>but it's being referenced/held in a way that's not a traditional ::findleak
21:24:48  * wwicksquit (Quit: I'm sure we'll meet again)
21:27:32  <trevnorris>ok. valgrind is picking up something is lost from crypto::RandomBytes
21:28:15  <tjfontaine>yes, it's almost certainly data_ leaking
21:28:18  * dominictarrquit (Ping timeout: 264 seconds)
21:28:22  <tjfontaine>I just put a check in the destructor
21:28:26  <tjfontaine>but it wasn't enough
21:30:53  <tjfontaine>hm, the destructor isn't running
21:31:19  <trevnorris>the destructor should be run from Buffer::New() (well, on the v0.10 branch)
21:31:33  <trevnorris>on v0.11 it uses Buffer::Use() and the memory is immediately free'd
21:31:42  * bradleymeckquit (Quit: bradleymeck)
21:31:49  <tjfontaine>no, not that one, the RandomBytesRequest destructor isn't running
21:32:13  <tjfontaine>the one that should fire in RandomBytesAfter or whatever
21:33:36  <trevnorris>well, RandomBytesFree is passed to Buffer::New(), which only deletes "data". but can you use a destructor on a struct in c++?
21:33:44  <trevnorris>thought they were only useful in classes
21:33:55  <tjfontaine>classes are just structs
21:34:02  <trevnorris>eh, ok.
21:34:17  * dominictarrjoined
21:34:27  <tjfontaine>I'm not seeing my fprintf before the delete of req anyway
21:34:34  <trevnorris>so "delete req;" should run the destructor.
21:34:41  <tjfontaine>I'm not seeing anything hit the noop callback either, are we even using the api right here? :)
21:35:14  <trevnorris>tjfontaine: http://nodejs.org/api/crypto.html#crypto_crypto_randombytes_size_callback
21:35:22  <tjfontaine>I think we're just filling up the threadpool honestly
21:35:48  <trevnorris>ah yeah. it does call uv_queue_work for every call.
21:37:09  <tjfontaine>ya, we're just hammering the threadpool, cranking it up to UV_THREADPOOL_SIZE 128 and then iter of 1e3 things actually progress
21:37:21  <tjfontaine>this looks like notabug to me :)
21:37:45  <tjfontaine>if you needed to generate this much random data you'd want to use a pipeline of sorts
21:38:19  <tjfontaine>trevnorris: thoughts? may I continue the 0.10.20 release?
21:41:16  * groundwaterquit (Ping timeout: 245 seconds)
21:41:22  <trevnorris>tjfontaine: sec. think I might have a test
21:43:18  <trevnorris>tjfontaine: check this again: https://gist.github.com/trevnorris/6770078
21:43:35  <trevnorris>tjfontaine: the total mem used is dependent on the size of the buffers you request.
21:43:44  <trevnorris>i'd think if it was the thread pool it would stay constant
21:43:59  <tjfontaine>no, because the memory is allocated before the work starts
21:44:48  <tjfontaine>trevnorris: you see what I'm referring to?
21:45:02  <trevnorris>tjfontaine: pretty sure I do. just going to write a line of code to confirm it.
21:45:28  <tjfontaine>need to move `new char[size]` to RandomBytesWork if that's how you want it to work
21:46:33  <trevnorris>sorry, just waiting for a script to run.
21:47:35  <trevnorris>tjfontaine: so why is this so large? https://gist.github.com/trevnorris/6770078
21:48:21  <trevnorris>and again the final size varies fairly linearly w/ the size of the random bytes request.
21:51:05  * groundwaterjoined
21:51:13  <trevnorris>tjfontaine: was that test case to your point? saw what you were saying about all the allocations up front (which also seems not so good), but waiting until each callback is called should've alleviated some of that.
21:51:40  <tjfontaine>trevnorris: process.memoryUsage() seems fairly consistent
21:51:48  <tjfontaine>nothing particularly scary
21:52:29  <tjfontaine>comment added to your gist
21:52:55  <tjfontaine>there's some sway there, but doesn't feel like a full on leak
21:53:04  * rendarquit
21:55:05  <trevnorris>tjfontaine: what's UMEM_DEBUG?
21:55:35  <tjfontaine>a flag for me to turn on and see if there's C/C++ memory being leaked
21:55:50  <trevnorris>work on linux?
21:56:14  <tjfontaine>if you build libumem on linux and LD_PRELOAD it
21:56:38  <tjfontaine>well at least if you have a debugger that can inspect the information from a core
21:56:52  <trevnorris>cool
21:56:55  <tjfontaine>it's like getting valgrind without all the awful overhead
21:57:00  <tjfontaine>it's part of my blog post
21:57:01  <trevnorris>haha, awesome.
22:00:13  <trevnorris>tjfontaine: cool. thanks. was able to reproduce those conditions.
22:00:31  <trevnorris>tjfontaine: though, you think we should delay the allocation of memory until it's actually needed?
22:00:57  <trevnorris>tjfontaine: anyways, that's for later. no issues w/ the next release :)
22:01:11  <tjfontaine>trevnorris: so, I changed it to gc and gcore every 1000 iterations, and then it went 11 times; over that time only 1mb of actual growth happened according to pmap
22:01:22  <tjfontaine>- total 42364K
22:01:23  <tjfontaine>+ total 43460K
22:01:58  <trevnorris>tjfontaine: yeah. i'm not disagreeing with you. was just able to reproduce that it's not "leaking"
22:01:59  <tjfontaine>trevnorris: right I already kicked the builds off with the belief that there wasn't anything actionable for this release anyway
22:02:45  <tjfontaine>trevnorris: delaying the allocs to when they're actually needed seems like a reasonable strategy in this case, but wanna make sure there's not some hole of multi-threaded allocs we fall into
22:03:18  <trevnorris>yeah... i'm not going there. very happy in my single-threaded world. thank you! ;)
22:03:28  * dominictarrquit (Ping timeout: 245 seconds)
22:03:51  <tjfontaine>:)
22:04:44  * dominictarrjoined
22:05:42  <MI6>joyent/node: tjfontaine created tag v0.10.20 - http://git.io/AzsTXA
22:06:19  <MI6>joyent/node: Timothy J Fontaine v0.10 * a63079f : blog: Post for v0.10.20 (+3 more commits) - http://git.io/HK-vGA
22:08:47  * AvianFluquit (Ping timeout: 260 seconds)
22:10:03  * vptrquit (Ping timeout: 248 seconds)
22:15:23  * dominictarrquit (Quit: dominictarr)
22:18:16  <MI6>nodejs-v0.10: #1510 UNSTABLE linux-ia32 (3/600) linux-x64 (2/600) smartos-x64 (1/600) osx-x64 (1/600) http://jenkins.nodejs.org/job/nodejs-v0.10/1510/
22:23:55  * Kakeraquit (Ping timeout: 248 seconds)
22:29:07  <MI6>node-review: #90 UNSTABLE windows-ia32 (8/600) osx-ia32 (1/600) centos-x64 (2/600) centos-ia32 (1/600) windows-x64 (8/600) smartos-x64 (2/600) http://jenkins.nodejs.org/job/node-review/90/
22:29:16  <MI6>nodejs-v0.10-windows: #236 UNSTABLE windows-ia32 (9/600) windows-x64 (6/600) http://jenkins.nodejs.org/job/nodejs-v0.10-windows/236/
22:38:54  * c4miloquit (Remote host closed the connection)
22:39:23  <othiym23>good job, tjfontaine!
22:39:42  <othiym23>you have solved a major PITA for many people!
22:39:53  <tjfontaine>othiym23: well, indutny did, without necessarily realizing it :)
22:40:41  <tjfontaine>othiym23: I wish I had isaacs here to fix npm itself :/
22:40:56  <othiym23>just gotta shelter in place
22:41:01  <othiym23>when is isaacs back?
22:41:12  <tjfontaine>he travels back this week, his work availability is monday
22:42:46  <tjfontaine>I just wanted to have a better root cause for npm itself, but I couldn't let another day go by without releasing a version of node that was satisfactory
22:42:50  <trevnorris>i might have just gone a tiny bit ape shit on someone
22:43:29  <tjfontaine>watching @davglass do his live coding at openhack, he types `npm init` and I see 0.10.19 and I flip the fuck out, and grab my coworker's arm while he `npm install`s
22:43:46  <tjfontaine>trevnorris: oh that thread
22:44:09  <trevnorris>hahaha, you get excited easily. or are there seriously that many people that far behind?
22:44:17  <trevnorris>i don't keep track of these things.
22:44:30  <trevnorris>and yeah. _that_ thread. :P
22:45:04  <tjfontaine>trevnorris: he was telling people to go download and install node to use for this open hack event, and all I could see were 500 people `npm install`ing and getting that ridiculous error
22:45:26  <tjfontaine>so while he was busy with his live coding experiment, I opened up my laptop and rolled the website back to 0.10.18 as the latest
22:45:46  <trevnorris>tjfontaine: ahhh. got it. you saw 0.10.19 and saw errors flying across everyone's screen.
22:46:14  <trevnorris>wow, quick on your keyboard. nice thinking though
22:46:15  <tjfontaine>ya, it didn't happen, at least so far as I know, but I was worried it would happen
22:46:30  * trevnorrissalutes tjfontaine for his good efforts at keeping the community happy
22:47:02  <tjfontaine>now I have to investigate something someone said about zlib and npm installs on master :)
22:47:21  <trevnorris>hehehe ;)
22:48:09  <trevnorris>tjfontaine: i'm having two problems. one is that ZCtx just aborts, the other is that i get more than a few invalid hashes
22:48:25  <tjfontaine>well that's likely an off-by-one in crypto streams
22:49:05  <tjfontaine>do you have a package that it always happens for?
22:49:24  <trevnorris>was just trying that out now. let me get one for you
22:51:42  * EhevuTovjoined
22:52:22  <trevnorris>tjfontaine: seems to happen when i am installing a bunch of things at once. so here's what I do:
22:52:58  <trevnorris>rm -rf node_modules ~/.npm/; /path/to/build/node $(which npm) --dev --nodedir=/path/to/build/ install newrelic
22:53:08  <tjfontaine>heh
22:53:17  <trevnorris>that'll usually cause the following to happen somewhere in the process:
22:53:23  <trevnorris>../src/node_zlib.cc:101: void node::ZCtx::Close(): Assertion `!write_in_progress_ && "write in progress"' failed.
22:56:09  <trevnorris>othiym23: fyi, ben did a partial review today. so far he likes how I did stuff, so going to fix pbkdf2. that's the last place w/o visibility
22:56:26  <othiym23>oh Jorge
22:56:35  * loladiroquit (Quit: loladiro)
22:56:38  <othiym23>trevnorris: saw the review, glad to see it's been so painless so far
22:57:02  <trevnorris>yeah. it was only a partial review. he's probably just sleep deprived from the new baby. ;P
22:57:13  <othiym23>also it's Monday
22:57:18  <trevnorris>yeah, that too.
22:57:31  <tjfontaine>is it monday?
22:57:37  <tjfontaine>looks like someone has a case of the mooooondays
22:57:44  <othiym23>I seem to have gotten in this pattern of waking up exhausted on Saturday after having worked like an asshole all week, and then waking up exhausted on Monday after having worked like an asshole all weekend
22:57:57  <trevnorris>heh
22:58:05  <tjfontaine>https://www.youtube.com/watch?v=nREz7wuafho
22:58:07  <othiym23>but I have shit in beta now, so progress is inching along
22:58:33  <tjfontaine>I got my email invite a bit ago, so time for me to play I guess :)
22:58:54  <othiym23>tjfontaine: if you spot obvious stupidity, pleeeeeeeease let me know
22:58:57  <tjfontaine>nod
22:59:10  <trevnorris>beta?
22:59:41  <othiym23>of New Relic for Node
23:00:01  <trevnorris>ah, ok.
23:00:18  <othiym23>it's no different than what we have on npm right now, although I've got a version with a way better MongoDB tracer and some better EE tracker probably this evening
23:00:33  <trevnorris>does it work on master?
23:00:59  <othiym23>let me put this carefully: all the tests (except for a couple Restify-related ones) pass
23:01:10  <trevnorris>coolio.
23:01:16  <othiym23>I have not actually run any apps on master with it yet
23:01:27  * loladirojoined
23:01:29  <trevnorris>heh, probably won't work. :P
23:03:10  <othiym23>I actually don't know why it wouldn't -- for once, domains hasn't totally changed between major versions
23:03:27  <trevnorris>i mean, the modules you're trying to run it against.
23:04:22  <othiym23>I haven't actually investigated why Restify fails, but I'm cautiously optimistic that the problem is mcavage's and not mine
23:04:30  <tjfontaine>the way trevor talks you'd think master was utterly broken :)
23:04:58  <trevnorris>hah. well, just trying not to be too optimistic. :)
23:05:15  <trevnorris>but possibly it's coming across as pessimism.
23:05:45  * AvianFlujoined
23:06:57  <othiym23>after what a PITA it was getting my tests to pass on 0.10 (there's a whooooooole lot of them at this point), I was delighted it was so easy to get the tests running on #6011 / master
23:07:16  <trevnorris>but hey. when ben told me he liked how I converted the RandomBytesRequest struct to a class, and didn't have a single correction in my api implementation. first thing I thought was
23:07:16  <trevnorris>man, he must be high or something.
23:07:23  <tjfontaine>npm ERR! Error: shasum check failed for /root/tmp/npm-73569-KsIN10dp/1380582415473-0.029555255779996514/tmp.tgz
23:07:26  <tjfontaine>npm ERR! Expected: bae0026ef445580b582bffa7201068bf101691cb
23:07:28  <tjfontaine>npm ERR! Actual: fd28e6c5475d60969fc991a927ab21655b9a0f57
23:07:31  <tjfontaine>blah
23:07:43  <trevnorris>yey!!! reproduced. sort of.
23:08:51  <trevnorris>othiym23: but seriously, I owe you something for staying up those two nights figuring out those two stupid bugs.
23:10:18  <tjfontaine>hmm it seems in my test it's always failing on backoff module
23:11:44  <tjfontaine>not quite the same as trevnorris's but
23:11:45  <tjfontaine>Assertion failed: !write_in_progress_ && "write in progress", file ../src/node_zlib.cc, line 101, function Close
23:12:05  <trevnorris>tjfontaine: i get that one too. they'll both pop up. :)
23:12:21  <tjfontaine>wish I had a better test case
23:12:28  <tjfontaine>but I guess, MANTA TO THE RESCUE
23:12:47  <othiym23>a manta instance for each permutation of all the modules on npm
23:12:56  <othiym23>all the versions of all the modules on npm
23:12:59  <tjfontaine>haha
23:13:01  <othiym23>a good test for manta
23:13:28  <othiym23>Joyent has a cluster made out of pure computronium sitting at the core of Jupiter, right?
23:13:37  <tjfontaine>no-comment
23:13:46  <trevnorris>tjfontaine: /path/to/node $(which npm) --dev --nodedir=/path/to/ install backoff
23:13:46  <trevnorris>gave me both errors.
23:13:51  <trevnorris>two for the price of one!
23:14:06  <tjfontaine>heh, I haven't been quite that lucky
23:14:25  <tjfontaine>https://gist.github.com/tjfontaine/3ffea410e5edd3587d32
23:14:31  <tjfontaine>didn't mean for that to be secret, but whatever
23:14:40  * hzquit
23:14:51  <tjfontaine>shame on people for using anonymous closures
23:15:03  <trevnorris>yeah. seriously.
23:15:14  <tjfontaine>anyway this is in core
23:15:21  <tjfontaine>because we're debug()'ing
23:16:32  <trevnorris>tjfontaine: here's mine: https://gist.github.com/trevnorris/6771701
23:17:12  <tjfontaine>enoent stuff is weird
23:17:29  <trevnorris>yeah. this issue is strangely complex.
23:17:57  <trevnorris>i'll generate a core, don't know why I haven't before.
23:18:15  <tjfontaine>anyway, at the least it seems like a zlib context is being reaped when the connection is still valid
23:19:00  <trevnorris>tjfontaine: hah, want all 7k lines of my npm debug log? :P
23:19:08  <tjfontaine>no thank you :)
23:19:13  <groundwater>tjfontaine: nice job on the cb() fix
23:19:38  <tjfontaine>groundwater: lets hope it never shows up again :)
23:19:59  <trevnorris>tjfontaine: yeah. so can't figure out why this one is also popping up: error Error: ENOENT, lstat '/var/...
23:20:04  <trevnorris>along with the core dump.
23:20:22  <tjfontaine>the core dump I get
23:20:55  * c4milojoined
23:22:06  <tjfontaine>it's not clear at the moment how it can be triggering a reap here though
23:22:42  <tjfontaine>when write_in_progress_ is set true, we also clear weak, which should make the gc not try and reap the object
23:22:55  <trevnorris>just ran npm with node_g, got: bit length overflow code 6 bits 6->7
23:23:08  <tjfontaine>you're on 64bit?
23:23:11  <trevnorris>yeah
23:23:18  <trevnorris>I get that every time.
23:23:39  <tjfontaine>curl https://github.com/v8/v8/commit/435d8aadc3f60fea2aaad23fae3f29e5aeccb907.patch | git am
23:23:43  <tjfontaine>and try again
23:24:37  <tjfontaine>well I guess really you'll need to .diff and -p1 anyway
23:24:37  <trevnorris>tjfontaine: that supposed to go on master?
23:25:18  <tjfontaine>(cd deps/v8 && curl https://github.com/v8/v8/commit/435d8aadc3f60fea2aaad23fae3f29e5aeccb907.diff | patch -p1); make
23:25:41  <trevnorris>ooh. cool.
23:26:48  <tjfontaine>see also https://github.com/joyent/node/issues/6295
23:27:15  <trevnorris>tjfontaine: nope. still get the bit length overflow
23:27:22  <tjfontaine>k
23:28:14  <tjfontaine>the mechanism for npm logging is just bizarre
23:28:47  <tjfontaine>npm ERR!
23:28:48  <tjfontaine>npm ERR! Additional logging details can be found in:
23:28:48  <tjfontaine>npm ERR! /root/tmp/npm-debug.log
23:28:48  <tjfontaine>npm ERR! not ok code 0
23:28:52  <tjfontaine>like, really, thanks.
23:29:00  <trevnorris>haha, yeah.
23:29:30  <trevnorris>well, the abort i'm getting must be a race condition. can't reproduce in debug mode of course. :P
23:29:35  <othiym23>as someone who is constantly asking people to send me the gzipped logs from something and getting all kinds of random crap *but* the gzipped logs, I totally understand the cowpath isaacs was paving there
23:29:38  <tjfontaine>trevnorris: heh
23:30:05  <tjfontaine>othiym23: yes, it's just the final "not ok code 0" that irks me
23:30:14  <trevnorris>and, "not ok code 0" doesn't make a lot of sense to me.
23:30:15  <tjfontaine>othiym23: along with not coalescing the failures
23:30:31  <tjfontaine>but streaming and all that jazz
23:30:50  <othiym23>my favorite is "weird exit code x" for whenever 'npm test' is running tap, which returns the number of failing test cases as the status code
23:30:53  <othiym23>like, that's not even weird
23:30:59  <tjfontaine>o0
23:31:01  <tjfontaine>la la la
23:31:03  <tjfontaine>I didn't just hear that
23:31:35  <trevnorris>whoa
23:31:53  <trevnorris>bit length overflow
23:31:53  <trevnorris>code 12 bits 6->7
23:31:53  <trevnorris>bit length overflow
23:31:53  <trevnorris>code 5 bits 5->6
23:31:53  <trevnorris>bit length overflow
23:31:54  <trevnorris>code 12 bits 6->7
23:31:56  <trevnorris>bit length overflow
23:31:58  <trevnorris>code 4 bits 6->7
23:32:00  <trevnorris>bit length overflow
23:32:02  <trevnorris>code 13 bits 7->6
23:32:04  <trevnorris>code 5 bits 5->6
23:32:06  <trevnorris>and so on and so on.
23:32:08  <tjfontaine>something is stomping on some memory there
23:32:34  <tjfontaine>I have a feeling I know what this is going to be, but I'm going to go ahead and formulate the manta job
23:34:20  <trevnorris>tjfontaine: i totally want to sneak in the back of the room for max's node specific training. that sounds awesome.
23:35:11  <tjfontaine>hehe
23:41:24  <trevnorris>tjfontaine: i've run it over a dozen times. can't get the stupid thing to abort in debug mode. :P
23:41:30  <tjfontaine>nod
23:42:04  <trevnorris>hah, and first time out of debug mode.
23:43:45  <othiym23>sounds like you guys are having all the luck with the race conditions today
23:45:16  <tjfontaine>:<
23:49:00  * TooTallNatejoined
23:50:22  <trevnorris>well, this is confusing. ulimit -c unlimited, but still no core dump.
23:51:00  <tjfontaine>try adding --abort-on-uncaught or whatever it is
23:52:02  <trevnorris>tjfontaine: well, my terminal says "Aborted (core dumped)" but there's no core file.
23:52:48  <tjfontaine>oh maybe you have set it to dump next to the binary or in the prefix