00:03:53  <hij1nx>no9 its on its way :) been really really busy -- but i have some help from dave pacheco, so i'll push it out soon
00:04:08  <hij1nx>FYI anyone logging anything might like this: https://github.com/hij1nx/logmap
00:04:48  <chilts>yeah, saw your conversation with lloyd, your module is nice - small tools for small things :)
00:31:56  * thl0 joined
00:48:34  * st_luke quit (Remote host closed the connection)
01:36:40  * mcollina quit (Read error: Connection reset by peer)
02:00:56  * timoxley quit (Quit: Computer has gone to sleep.)
02:13:20  * timoxley joined
02:29:48  <thl0>is level API to be 100% compat with levelup API?
02:30:44  <thl0>asking cause when I switched to level chained batch broke (i.e.: https://github.com/thlorenz/level1/blob/master/samples/keys-01.js#L10)
02:31:19  <thl0>seems like chained batch wasn't exposed at all - error: WriteError: batch() requires an array argument
02:32:05  * mcollina joined
02:38:26  <rvagg>eeek
02:38:49  <rvagg>double-check your level version, is it 0.9?
02:39:04  <rvagg>oh crud, 0.8 is in npm
02:39:51  * rvagg fixes
02:39:55  <thl0>rvagg: np, using levelup with leveldown now - just wondering
02:40:12  <rvagg>no idea why 0.9 wasn't published, must have been too much going on at the time and I slipped up
02:40:46  <thl0>let me know when the fix is in I'm gonna give it another spin and report back ;)
02:42:47  <rvagg>published
02:43:40  * thl0 installing
02:44:43  <thl0>works, thanks
02:45:26  <thl0>rvagg: btw if I was going to store npm modules in my db and want to match a search query to keywords (an Array)
02:45:34  <thl0>would you suggest just using level?
02:45:52  <thl0>or should I go a bit higher i.e. mapped-index?
02:47:23  <rvagg>you could give it a go and build it yourself rather than using mapped-index
02:47:35  <rvagg>mapped-index is pretty simple tho, there's also level-index which is even simpler
02:47:41  <thl0>cool will do that
02:47:56  <thl0>not sure how I'd derive the key though
02:48:15  <thl0>i.e. if the keywords were [ "level", "db", "store" ]
02:48:49  <thl0>since it's not just one value I'd index by (assuming that all keywords are of same importance)
02:49:30  <rvagg>yeah, so what you want to do is look up by keyword and you'll have multiple entries per keyword so they need to be unique so append the package name after it cause that'll be unique
02:49:43  * ralphtheninja quit (Ping timeout: 264 seconds)
02:49:47  <rvagg>!index!keyword!level!package1
02:49:49  <rvagg>!index!keyword!level!package2
02:49:54  <rvagg>!index!keyword!db!package2
02:49:59  <rvagg>!index!keyword!store!package3
02:50:14  <thl0>got it, thanks
02:50:28  <rvagg>then, do a readstream starting at !index!keyword!level! and ending at !index!keyword!level!\xff and you'll get em all!
02:50:43  <rvagg>'!' perhaps not being the best choice, use \x00 or \xff probably
02:50:49  * timoxley quit (Quit: Computer has gone to sleep.)
02:51:17  <thl0>kinda like I have one collection just storing package names and another one keyed by package name that has actual package info?
02:51:20  <rvagg>that's effectively what mapped-index automates for you
02:51:41  <rvagg>yeah, that's right
02:51:54  <thl0>thanks, wanna do this by hand as much as possible, so I get a better understanding
02:51:59  <rvagg>so the indexes are just redirections to the actual packages, the values should probably be the package name or whatever the primary key is
02:52:12  <thl0>yep, understood
02:52:27  <rvagg>look in the code of level-mapped-index, it's quite short cause map-reduce does most of the word
02:52:31  <rvagg>s/word/work
02:53:58  <rvagg>just be conscious of potential overlaps
02:54:10  <rvagg>like 'level' vs 'leveldb', when you search for 'level' you shouldn't allow it to find 'leveldb'
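The key layout rvagg sketches above can be written as a couple of small helpers, using '\x00' as the separator instead of '!' as he suggests. The names `indexKey` and `indexRange` are this sketch's own, not levelup API:

```javascript
// One index entry per (keyword, package) pair; the package name makes
// the key unique, and the value points back at the primary record.
var SEP = '\x00'

function indexKey (keyword, pkg) {
  return ['index', 'keyword', keyword, pkg].join(SEP)
}

// Range covering every package indexed under `keyword` -- and nothing
// more, so a scan for 'level' cannot wander into 'leveldb' entries.
function indexRange (keyword) {
  var prefix = ['index', 'keyword', keyword].join(SEP) + SEP
  return { start: prefix, end: prefix + '\xff' }
}

// With a real levelup db you would then do something like:
//   db.createReadStream(indexRange('level')).on('data', ...)
```

The trailing separator in the prefix is what handles the overlap problem: 'leveldb' keys sort outside the 'level' range because '\x00' terminates the keyword before comparison reaches the extra characters.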
02:54:23  * timoxley joined
03:12:57  * thl0 quit (Remote host closed the connection)
03:33:18  * eugeneware joined
04:22:33  * werle joined
04:23:49  * timoxley quit (Quit: Computer has gone to sleep.)
04:40:25  * werle quit (Quit: Leaving.)
06:11:09  * werle joined
06:12:18  * werle quit (Client Quit)
07:01:44  * timoxley joined
07:17:16  * ralphtheninja joined
07:37:30  * timoxley quit (Quit: Computer has gone to sleep.)
07:45:57  * timoxley joined
07:55:07  * Pwnna quit (Ping timeout: 264 seconds)
08:11:50  * eugeneware quit (Quit: Leaving.)
08:23:47  * wolfeidau quit (Remote host closed the connection)
08:26:09  * wolfeidau joined
08:50:19  * ChrisPartridge quit (Ping timeout: 264 seconds)
09:29:29  * weetabeex joined
09:29:34  <weetabeex>morning
09:31:54  <weetabeex>a user found a project (simpleleveldb) claiming that long-running leveldb processes tend to result in unbounded memory growth due to the MANIFEST file never being compacted; he was unable to find any confirmation on this after looking on the internets; is anyone able to confirm or deny this? :)
09:37:58  * tnt joined
09:42:48  <rvagg>weetabeex: not from my experience, over time leveldb actually trims up its memory and gets more compact
09:43:16  <weetabeex>tnt, ^
09:43:36  <weetabeex>rvagg, kay, thanks
09:44:27  <tnt>what about the disk space used by LOG and MANIFEST ? is that also trimmed from time to time ?
09:44:44  <rvagg>LOG is constantly compacted, that's a continual process
09:45:20  <rvagg>MANIFEST, I'm doubtful there's enough data in there to be a problem unless you have some really massive data set
09:45:45  <tnt>LOG being the text file, not the xxxx.log binary file
09:45:49  <rvagg>memory usage does initially grow, quite large, but then over time as the levels grow, memory usage becomes much more efficient
09:46:31  <tnt>MANIFEST is ~ 26M at the moment, after 5 days. It's not too concerning right now, but when restarting, it gets back to like 500k ...
09:47:19  <rvagg>tnt, are you using leveldb direct? try running a CompactRange and see how it goes
09:47:41  <weetabeex>rvagg, he's using it indirectly, via one of ceph's components
09:47:46  <rvagg>ok
09:47:54  <rvagg>btw, I did this recently: https://github.com/rvagg/node-leveldown/blob/master/test/leak-tester.js
09:48:16  <rvagg>it only ends up with a 20GB database, but over time memory usage of LevelDB + LevelDOWN + Node ends up ~ 130MB
09:48:25  <weetabeex>we shifted to leveldb a couple of releases back and we've been having some issues with leveldb growth, although it's getting slightly under control via explicit compacts at times
09:49:37  <rvagg>weetabeex: what write_buffer_size did you end up with?
09:49:54  <weetabeex>64 MB I think
09:49:55  <weetabeex>let me check
09:50:19  <weetabeex>32 MB
09:50:33  <weetabeex>64KB block size
09:50:47  <rvagg>mm, that's still large and could explain some of your memory usage issues, but for your use case it'd have to come down to benchmarks to see what the best values are
09:51:17  <tnt>well, right now memory usage isn't as much a concern as the disk usage.
09:51:24  <weetabeex>rvagg, we haven't seen many complaints about memory usage lately
09:51:29  <weetabeex>main issue is disk usage
09:51:40  <tnt>basically I'm trying to understand the 'jumps' http://i.imgur.com/D17beYv.png
09:51:56  <rvagg>tnt: ahh, ok, disk space, so there are some outstanding issues with leveldb & disk space and dodgy compactions, are you on the leveldb google group or do you watch the issues list?
09:52:15  <weetabeex>until a couple weeks ago we would see a store growing up to a dozen GBs while only containing couple hundred MBs worth of data
09:52:53  <weetabeex>rvagg, I'm not; not sure if anyone else on the project is, but I doubt it
09:52:54  <tnt>weetabeex: well ... that one is explained and not quite due to leveldb :p
09:53:09  <weetabeex>tnt, oh, not that one
09:53:12  <weetabeex>before then
09:53:20  <tnt>Ah ok, nm.
09:53:29  <weetabeex>mike dawson, iirc, was hitting a 32GB store
09:53:33  <rvagg>you should subscribe to the google group, it's not noisy but there are occasionally issues that come up with stuff not compacting properly, they're working on it but that could explain some of your issues
09:53:40  <weetabeex>sage then started compacting at times
09:53:54  <rvagg>CompactRange() might be interesting to try tho, in theory it should do what leveldb is doing when you re-open the store
09:53:59  <weetabeex>rvagg, kay, will do
09:54:30  <weetabeex>rvagg, we're compacting with range [NULL, NULL] and with explicit ranges whenever we delete batches of keys
09:54:35  <weetabeex>err
09:54:45  <weetabeex>[NULL, NULL] at spurious moments in time
09:55:02  <rvagg>weetabeex: also, just as an experiment you may want to substitute in Basho's leveldb fork, it has quite a few "optimisations" that they claim make it better for a server environment: https://github.com/basho/leveldb
09:55:17  <rvagg>it should be a relatively straightforward procedure of substituting it in, but there are a few more tunables
09:55:37  <weetabeex>rvagg, any idea if that's available via ubuntu's repositories?
09:55:44  <rvagg>weetabeex: doubt it very much
09:55:47  <weetabeex>yeah
09:55:58  <rvagg>weetabeex: its main use is just in eleveldb which is for riak
09:56:03  <weetabeex>I'll mention it to the guys though
09:56:26  <rvagg>yeah, do that, their over-time benchmarks suggest that they're able to push a lot more data in a lot quicker than vanilla leveldb
09:57:18  <rvagg>multiple compaction threads and a more relaxed overlapping-key policy for the lower levels; but that's all mostly about throughput, that seems to be what they care about -- but that may suit your use-case too
09:58:10  <rvagg>I haven't got any benchmarks on it but I'm planning on doing so and making it available as an alternative to Node leveldb users
10:01:00  <weetabeex>rvagg, cool, if only I was able to trigger the same behavior that tnt sees in his cluster, testing this would be much, much easier
10:01:15  <rvagg>aye, it can be quirky
10:02:53  * timoxley quit (Quit: Computer has gone to sleep.)
10:23:03  * mcollina quit (Read error: Connection reset by peer)
10:37:40  * mcollina joined
10:39:25  * mcollina_ joined
10:42:09  * mcollina quit (Ping timeout: 245 seconds)
12:13:33  * levelbot joined
12:13:33  <levelbot>[npm] [email protected] <http://npm.im/levelplus>: Adds atomic updates to levelup database (@nharbour)
13:46:10  <rvagg>not a bad collection of stuff in there: https://github.com/nharbour/levelplus/blob/master/index.js
13:46:28  <rvagg>shame it's just not a bunch of separate packages, or perhaps a base package for the lock stuff with stuff that builds on it
13:59:55  * timoxley joined
14:01:37  * levelbot quit (Remote host closed the connection)
14:01:58  * levelbot joined
14:18:07  * timoxley quit (Quit: Computer has gone to sleep.)
14:19:48  * timoxley joined
14:22:55  * no9 joined
14:28:33  * thl0 joined
14:33:05  * thl0 quit (Ping timeout: 248 seconds)
14:43:21  <levelbot>[npm] [email protected] <http://npm.im/homer>: Dynamic DNS server (@nharbour)
14:45:52  <levelbot>[npm] [email protected] <http://npm.im/homer>: Dynamic DNS server (@nharbour)
14:49:49  <levelbot>[npm] [email protected] <http://npm.im/homer>: Dynamic DNS server (@nharbour)
14:52:20  * thl0 joined
15:00:14  * thl0 quit (Remote host closed the connection)
15:01:04  * thl0 joined
15:05:43  * thl0 quit (Remote host closed the connection)
15:20:42  * thl0 joined
15:28:01  * werle joined
15:37:02  * werle quit (Quit: Leaving.)
15:45:54  <levelbot>[npm] [email protected] <http://npm.im/homer>: Dynamic DNS server (@nharbour)
15:46:43  * tnt part
16:02:49  * levelbot quit (Ping timeout: 246 seconds)
16:04:15  * timoxley quit (Quit: Computer has gone to sleep.)
16:06:08  * levelbot joined
16:06:11  <levelbot>[npm] [email protected] <http://npm.im/homer>: Dynamic DNS server (@nharbour)
16:06:40  * werle joined
16:34:21  * werle quit (Quit: Leaving.)
16:41:32  <levelbot>[npm] [email protected] <http://npm.im/homer>: Dynamic DNS server (@nharbour)
17:12:17  <thl0>creationix: I totally see where you are coming from, it's just that I find isolation into repos more important and worth facing some challenges
17:17:39  * thl0 quit (Remote host closed the connection)
18:14:39  * mcollina_ quit (Ping timeout: 245 seconds)
18:32:19  * mcollina joined
18:38:56  * thl0 joined
19:10:11  * st_luke joined
19:25:38  * Pwnna joined
19:43:35  * dominictarr joined
19:48:04  <thl0>rvagg: continuing our disc. from yesterday ...
19:48:26  <thl0>do the indexes live in a separate db from the values they are indexing?
19:56:01  <levelbot>[npm] [email protected] <http://npm.im/lev>: commandline and REPL access for leveldb (@hij1nx)
20:01:00  <thl0>so indexes in separate db or not? anyone? hij1nx juliangruber Raynos
20:02:20  <juliangruber>thl0: why would you put them into a seperate db?
20:02:37  <thl0>juliangruber: for context: https://github.com/thlorenz/level1/blob/master/samples/index-keywords.js
20:02:49  <thl0>so my vehicles are indexed here in a db
20:03:04  <juliangruber>what about level-sublevel
20:03:08  <juliangruber>so it feels like a seperate db
20:03:19  <thl0>but they may have properties, i.e. where would I store { key: car, value: 'vehicle invented ...' }
20:03:52  <thl0>juliangruber: trying to just use levelup interface for now to gain better understanding
20:04:44  <thl0>juliangruber: so can I create multiple collections in one db and how?
20:05:56  * thl0 looks into sublevel
20:07:04  <thl0>juliangruber: I may not need sublevel since I don't mind having things in separate dbs
20:07:14  <thl0>just wanted to see what's the common way to do this
20:11:18  <juliangruber>yeah, that's sublevel
20:11:19  <juliangruber>:P
20:14:02  <dominictarr>heh, just realized that bitcoin uses leveldb!
20:14:55  * no9 quit (Ping timeout: 264 seconds)
20:14:59  <dominictarr>thl0: if you put indexes in a single db, you can update them atomically
20:15:22  <dominictarr>i.e. update an index with a batch op when you update a key
20:15:42  <dominictarr>this way, you know that your indexes are always consistent with your data!
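A minimal sketch of what dominictarr describes: build the record write and its index writes as one array and hand it to `batch()`, so they commit (or fail) together. The `batchOps` helper and the key layout here are this sketch's own assumptions, not levelup API:

```javascript
// Produce one batch covering the primary record plus one index entry
// per keyword; the index value just points back at the primary key.
function batchOps (pkg) {
  var ops = [
    { type: 'put', key: 'package\x00' + pkg.name, value: JSON.stringify(pkg) }
  ]
  pkg.keywords.forEach(function (kw) {
    ops.push({
      type: 'put',
      key: 'index\x00keyword\x00' + kw + '\x00' + pkg.name,
      value: pkg.name
    })
  })
  return ops
}

// With a levelup instance `db`, the atomic write would then be:
//   db.batch(batchOps(pkg), function (err) { ... })
```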
20:15:55  <thl0>dominictarr: thanks got it, so this is a common way to do it then
20:16:15  <dominictarr>yup
20:16:29  <thl0>i.e. it's normal to have one data set presented/indexed across multiple dbs
20:16:58  <dominictarr>not sure what you mean by "normal"
20:17:34  <dominictarr>I gotta go eat. will be back in a bit
20:17:35  <thl0>normal as in that is how leveldb peeps do it ;)
20:17:41  <thl0>ok
20:17:49  <st_luke>why would you do more than one db? how much data are you dealing with where that would be appropriate?
20:17:52  <thl0>ping me plz, got more questions
20:18:03  <thl0>st_luke: the entire npm registry
20:18:13  <thl0>i.e. all packages indexed by name/keyword
20:18:38  <st_luke>just model your keys appropriately dude
20:19:34  <thl0>st_luke: working on it: https://github.com/thlorenz/level1/blob/master/samples/index-keywords.js
20:19:38  <thl0>does that look about right?
20:20:18  <thl0>st_luke: so what would be a way to use only one db and why is that better?
20:20:45  <st_luke>i dont really have time to talk about it in depth right now but we should talk about it another time
20:21:05  <thl0>so the idea is that I'll have all module names indexed by keywords and keep actual info somewhere else - ok
20:21:26  <thl0>I'll keep playing with it hopefully things become clearer
20:22:43  <st_luke>laterz
20:22:44  * st_luke quit (Remote host closed the connection)
20:26:32  <levelbot>[npm] [email protected] <http://npm.im/level-store>: A streaming storage engine based on LevelDB. (@juliangruber)
20:27:09  <Raynos>thl0: same db
20:27:14  <Raynos>different sub levels
20:27:28  <thl0>ok I'll use the sublevel module then
20:27:37  <Raynos>you can do sublevel manually
20:27:38  <Raynos>just prefix keys
20:27:49  <Raynos>with ~~INDEXES~INDEX_NAME~INDEX_KEY
20:27:50  <thl0>ah - click
20:28:01  <thl0>another click - got it
20:28:15  <Raynos>collections are just prefixes
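Raynos's manual-prefix approach can be sketched as a tiny wrapper, roughly what level-sublevel automates. `sublevel` here is a hypothetical helper around an existing levelup-style `db`, not a real API:

```javascript
// A "collection" is nothing but a key namespace: every operation goes
// through the same underlying db with the collection name prepended.
function sublevel (db, name) {
  var prefix = name + '\x00'
  return {
    put: function (key, value, cb) { db.put(prefix + key, value, cb) },
    get: function (key, cb) { db.get(prefix + key, cb) },
    prefixed: function (key) { return prefix + key }
  }
}

// e.g. var packages = sublevel(db, 'packages')
//      packages.put('level1', '...') stores under 'packages\x00level1'
```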
20:28:31  <thl0>thanks a lot finally understand
20:29:08  <thl0>I'll implement another version and push to github and link it when I'm done, then you can have a look (and criticise ;) )
21:06:32  <levelbot>[npm] [email protected] <http://npm.im/level-store>: A streaming storage engine based on LevelDB. (@juliangruber)
21:17:43  <dominictarr>thl0: the entire npm registry in a database isn't that big
21:20:31  <thl0>dominictarr: just the metadata - not too big I think - I pulled all of it it's about 20MB
21:20:46  <dominictarr>https://github.com/dominictarr/npmd
21:20:59  <dominictarr>^ that is my npm thing
21:21:04  <thl0>all user's metadata is about 16MB
21:21:07  <dominictarr>halfway there
21:21:15  <thl0>dominictarr: saw that
21:21:24  <dominictarr>thl0: how are you syncing it?
21:21:31  <thl0>but I can just pull it via a JSON request to couch
21:21:46  <dominictarr>but which route?
21:21:57  <thl0>dominictarr: trying to pull it, then analyze and add extra info regarding module quality
21:22:16  <thl0>dominictarr: all users: curl -k https://registry.npmjs.org/-/users/
21:22:19  <dominictarr>I'm using the changes feed, which dl's 500 mb last check
21:22:23  <thl0>all data curl -k https://registry.npmjs.org/-/all/
21:22:25  <thl0>I think
21:22:55  <thl0>dominictarr: what url is that - I'll prob need that as well
21:23:21  <dominictarr>I wrote a module for it
21:23:25  <dominictarr>level-couch-sync
21:24:01  <thl0>but that just syncs the couch data into a leveldb - I need to analyze as well, but possibly I could use that to get all the data in the first place
21:24:07  <dominictarr>currently, npmd pulls all the modules, and keeps the latest readme, and the deps for each version.
21:24:19  <dominictarr>right
21:24:27  <thl0>maybe that'd be faster than via REST interface?
21:24:40  <dominictarr>well, actually, it allows you to map the data, so you can discard bits you don't want
21:25:03  <dominictarr>then, you feed it through a map-reduce or whatever selection of queries you want
21:25:14  <thl0>dominictarr: ah, ok - well let me get a bit more familiar with level and then I can optimize my solution
21:25:39  <dominictarr>like, I put all the readmes through level-inverted-index
21:25:43  <thl0>dominictarr: to give some context, this is what I'm trying to build: https://github.com/thlorenz/modurater
21:25:58  <dominictarr>which gives you full text search… but it needs some tuning
21:26:11  <dominictarr>oh, I remember - I saw that!
21:26:32  <thl0>I decided it would be a good reason to get into level ;)
21:26:42  <dominictarr>yes, definitely!
21:26:54  <dominictarr>I wanted to work on this too, but got busy...
21:27:05  <thl0>dominictarr: we can join forces ;)
21:27:19  <dominictarr>we should
21:27:38  <dominictarr>if you use level-sublevel, that will be really easy.
21:27:55  <dominictarr>just need to figure out a good plugin pattern
21:28:03  <thl0>getting there - right now I'm still learning how to do it just with proper indexes like Raynos showed me
21:28:19  <thl0>trying to understand this stuff ya know ;)
21:31:53  <dominictarr>feel free to ask any questions!
21:32:17  <dominictarr>although, I'll have to answer them later - it's time I got some sleep...
21:34:31  <thl0>dominictarr: you bet I will :) - and yeah sleep is good
21:34:39  <dominictarr>night
21:34:58  * dominictarr quit (Quit: dominictarr)
21:40:16  <thl0>so Raynos I combined indexes and values into one db here: https://github.com/thlorenz/level1/blob/master/samples/index-keywords-and-add-values.js
21:40:46  <thl0>only downside is that you have to JSON.stringify all objects since the indexes have string values
21:40:57  <thl0>and the values are objects
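One way to live with the downside thl0 mentions is to JSON-encode only the record values and leave index values as plain strings, via a pair of tiny wrappers. Helper names here are hypothetical and `db` is assumed to be a levelup-style instance; levelup also offers a JSON value-encoding option that automates this, though option names vary across versions:

```javascript
// Object records go through JSON on the way in and out...
function putRecord (db, key, obj, cb) {
  db.put(key, JSON.stringify(obj), cb)
}

function getRecord (db, key, cb) {
  db.get(key, function (err, value) {
    if (err) return cb(err)
    cb(null, JSON.parse(value))
  })
}

// ...while index entries keep using db.put/db.get with string values.
```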
21:45:21  <thl0>Raynos: gotta go, but if you get a chance I'd appreciate if you have a sec to look thru the code and let me know what I'm still doing wrong
21:46:20  * thl0 quit (Remote host closed the connection)
21:47:33  <levelbot>[npm] [email protected] <http://npm.im/level-store>: A streaming storage engine based on LevelDB. (@juliangruber)
21:52:32  <levelbot>[npm] [email protected] <http://npm.im/level-store>: A streaming storage engine based on LevelDB. (@juliangruber)
22:00:02  <levelbot>[npm] [email protected] <http://npm.im/level-store>: A streaming storage engine based on LevelDB. (@juliangruber)
22:12:30  * timoxley joined
23:08:08  * mcollina quit (Remote host closed the connection)
23:14:54  * thl0 joined
23:23:20  * ChrisPartridge joined