00:00:00  <dap>yes, I think all of that is possible (and good ideas :)
00:00:29  <trentm>logrotate plugins?
00:01:24  <nahamu>maybe just trick logrotate to think that "mput" is a mailer...
00:02:16  <nahamu>http://linuxcommand.org/man_pages/logrotate8.html
00:02:59  <nahamu>I was thinking of the "postrotate" scripting, but perhaps you can't pass in a filename to that.
00:04:56  <nahamu>but yeah, a wrapper around mput could pretend to be the email program but actually upload log files about to be purged to manta.
00:07:52  <trentm>I see. Yah, good idea.
00:09:45  <rmustacc>Sounds like you kind of want something to send your rsyslog stuff to.
00:16:27  <nahamu>sure, that's the right thing to do.
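
A rough sketch of the "mput pretends to be the mailer" idea above, in JavaScript for Node.js. It assumes logrotate invokes its configured mail program as `<program> -s <subject> <address>` with the log contents on stdin and the log's file name as the subject, and that `mput` reads the object body from stdin when no `-f` is given; both should be checked against the installed versions. The `~~/stor/logs` prefix and the function names are made up for illustration.

```javascript
'use strict';
const path = require('path');
const { spawn } = require('child_process');

// Hypothetical helper: turn mailer-style arguments into a Manta object path.
// Assumes argv looks like ['-s', '<subject>', '<address>'].
function mantaPathFromMailArgs(argv, prefix) {
  const i = argv.indexOf('-s');
  if (i === -1 || i + 1 >= argv.length) {
    throw new Error('expected "-s <subject>" in arguments');
  }
  // Keep only the base name so the subject cannot escape the prefix.
  const name = path.posix.basename(argv[i + 1].trim());
  if (name === '' || name === '.' || name === '..') {
    throw new Error('could not derive an object name from the subject');
  }
  return prefix.replace(/\/+$/, '') + '/' + name;
}

// Hand our stdin (the log contents) straight to mput.
function upload(argv) {
  const dest = mantaPathFromMailArgs(argv, '~~/stor/logs');
  const child = spawn('mput', [dest], { stdio: 'inherit' });
  child.on('error', (err) => {
    console.error('could not run mput: ' + err.message);
    process.exit(1);
  });
  child.on('exit', (code) => process.exit(code === null ? 1 : code));
}
```

To use it as a wrapper script, call `upload(process.argv.slice(2))` at the bottom of the file and point logrotate's mail program setting at the script.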
00:19:43  * nfitch quit (Quit: Leaving.)
00:22:15  * trentm quit (Quit: Leaving.)
00:23:12  * fredk quit (Quit: Leaving.)
00:24:37  * ghostbar quit (Read error: Connection reset by peer)
00:25:32  * ghostbar joined
00:31:32  * yunong1 quit (Quit: Leaving.)
00:35:01  * mcavage quit
00:38:42  * dap quit (Quit: Leaving.)
00:50:01  * nfitch joined
00:53:27  * chorrell joined
00:57:09  * nfitch quit (Ping timeout: 248 seconds)
01:44:31  * abraxas joined
01:57:50  * trevoro joined
01:58:39  * trevoro quit (Client Quit)
02:42:50  * nfitch joined
02:46:53  * nfitch quit (Ping timeout: 240 seconds)
02:50:50  * mikeal joined
03:02:28  * mikeal quit (Quit: Leaving.)
03:04:40  * mikeal joined
03:11:21  * mikeal quit (Quit: Leaving.)
03:11:59  * mikeal joined
03:18:26  * mikeal quit (Quit: Leaving.)
03:51:11  * chorrell quit (Quit: My iMac has gone to sleep. ZZZzzz…)
03:53:32  * bixu quit (Remote host closed the connection)
04:33:56  * mcavage joined
04:33:58  * mcavage quit (Client Quit)
04:44:17  * yunong joined
04:45:02  * yunong quit (Client Quit)
04:47:13  * ghostbar quit (Remote host closed the connection)
04:47:42  * ghostbar joined
04:51:53  * ghostbar quit (Ping timeout: 240 seconds)
05:03:01  * mikeal joined
05:05:11  * mikeal quit (Client Quit)
05:11:23  * ghostbar joined
05:37:46  * nfitch joined
05:39:22  * nfitch quit (Client Quit)
05:44:55  * mikeal joined
05:45:18  * mikeal quit (Client Quit)
05:49:24  * mikeal joined
06:35:17  * aolson_ joined
06:36:10  * aolson quit (Read error: Operation timed out)
07:21:10  * mamash joined
07:26:03  * mikeal quit (Quit: Leaving.)
08:03:01  * mikeal joined
08:27:05  * ghostbar quit (Remote host closed the connection)
08:27:33  * ghostbar joined
08:32:24  * ghostbar quit (Ping timeout: 276 seconds)
09:05:03  * bixu joined
09:09:26  * bixu quit (Ping timeout: 240 seconds)
10:53:34  * abraxas quit (Remote host closed the connection)
11:05:23  * bixu joined
11:09:26  * bixu quit (Ping timeout: 240 seconds)
11:45:31  * Vod joined
13:00:23  * chorrell joined
13:06:05  * bixu joined
13:10:23  * bixu quit (Ping timeout: 240 seconds)
13:18:49  * aolson_ quit (Quit: Textual IRC Client: www.textualapp.com)
13:45:16  * mamash part
13:45:25  * mamash joined
13:49:18  * Vod quit (Quit: Vod)
14:11:52  * chorrell_ joined
14:19:36  * Vod joined
14:19:56  * fredk joined
14:28:57  * Vod quit (Ping timeout: 264 seconds)
14:30:14  * chorrell quit (Quit: My iMac has gone to sleep. ZZZzzz…)
14:34:09  * fredk quit (Quit: Leaving.)
14:35:34  * mikeal quit (Quit: Leaving.)
14:46:13  * mcavage joined
14:49:08  * Vod joined
15:26:35  * mikeal joined
15:52:32  * fredk joined
15:58:17  * mamash part
16:00:06  * mikeal quit (Quit: Leaving.)
16:12:38  * dap joined
16:18:53  * yunong joined
16:27:18  * trentm joined
16:27:58  * Vod quit (Quit: Vod)
16:28:49  * mamash joined
16:36:07  * mikeal joined
16:38:45  * elijah-mbp quit (Ping timeout: 245 seconds)
16:39:28  * elijah-mbp joined
16:45:33  * ghostbar joined
16:46:29  * natefoo joined
16:47:50  <mcavage>mikeal: dap pointed out to me last night that there's an easier way than what I gave you yesterday to run image magick in manta: http://apidocs.joyent.com/manta/job-patterns.html#image-conversion
16:48:21  <mikeal>nice
16:48:28  <mikeal>i realized i'm going to have to write some code around this
16:48:43  <mikeal>so that i don't re-encode all the images ever every time we add new ones
16:48:45  <mcavage>to do the "migration" or the conversion?
16:49:05  <mikeal>and i realized it would be better to write them to new directories by their conversion rather than mutate the filename
16:49:12  <mcavage>yeah - in the absence of triggers you'll have to do some kind of queueing and/or dedup'ing client-side.
16:49:28  <mikeal>well, the migration is literally moving everything in a new bucket because i foolishly put all my images in /public :)
16:49:40  * dap quit (Quit: Leaving.)
16:49:51  <mcavage>you can just use mln fwiw, so you don't actually have to copy bytes around.
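
A minimal sketch of the client-side dedup'ing mentioned above, in JavaScript: given the names of the source images and the names already present in the converted directory, work out which images still need converting so a new job only gets the new inputs. The naming rule (same base name with a `.png` extension) and both function names are assumptions for illustration; the two lists would come from directory listings of the source and output directories.

```javascript
'use strict';

// Hypothetical naming rule: "cat.jpg" is converted to "cat.png".
function convertedName(sourceName) {
  return sourceName.replace(/\.[^./]+$/, '') + '.png';
}

// Return the source names that have no converted counterpart yet.
function pendingConversions(sourceNames, convertedNames) {
  const done = new Set(convertedNames);
  return sourceNames.filter((name) => !done.has(convertedName(name)));
}
```

Writing converted output to its own directory, as mikeal suggests, keeps this check to a simple name comparison between two listings.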
17:06:49  * bixu joined
17:11:16  * bixu quit (Ping timeout: 264 seconds)
17:15:55  * yunong quit (Quit: Leaving.)
17:18:48  * bixu joined
17:24:30  * nfitch joined
17:44:58  * dap joined
17:49:50  * mcavage quit (Remote host closed the connection)
17:50:22  * mcavage joined
17:55:01  * mcavage quit (Ping timeout: 268 seconds)
17:57:12  * trevoro joined
17:59:50  * mcavage joined
18:03:54  * yunong joined
18:47:52  * mamash part
18:52:34  * bixu quit (Remote host closed the connection)
18:53:01  * bixu joined
18:57:09  * mcavage quit (Remote host closed the connection)
19:10:46  * trevoro quit (Quit: leaving)
19:13:12  * mikeal part
19:21:18  * bixu_ joined
19:21:58  * bixu_ quit (Remote host closed the connection)
19:22:24  * bixu___ joined
19:23:18  * bixu quit (Ping timeout: 256 seconds)
19:33:07  * mcavage joined
19:55:18  * mamash joined
20:31:25  * mamash part
20:34:01  * mamash joined
20:49:03  * mamash part
20:52:08  <mjn>possibly something that doesn't make sense to do, but: is there a way to make a file (or dir) public *on manta* but not publicly retrievable via HTTP?
20:52:41  <mjn>use-case is to let other people run manta jobs on some processed files from their own accounts, but not to let them curl -O them and rack up big bandwidth bills for me (esp. since they aren't the versions of the files they should be downloading if that's what they want)
20:54:36  * chorrell_ quit (Quit: Textual IRC Client: www.textualapp.com)
20:55:15  <mcavage>mjn: there's no such ability to do that right now.
20:55:35  <mcavage>that would be covered by the unicorn that is "access control"
20:56:15  <mjn>ah ok
20:56:18  <nahamu>tell someone in sales about how many terabytes of stuff you'd upload to manta if you could do that... :-P
20:56:25  <mjn>not a big deal, more of a 'would be nice'
20:56:46  <mcavage>nahamu: no, it's top of the list regardless ;).
20:56:53  <nahamu>mcavage: ah, excellent!
20:57:10  <nahamu>so it's a unicorn being actively chased.
20:57:43  <mjn>for a handful of colleagues i can just tell them, hey, don't curl these, but there's a few cases where i'm thinking of building something that's a mixture of public-service and businessy on top of manta
20:58:03  <mjn>basically, here is how you query this data set if you sign up for a manta account, if that sounds too complex pay me $n and i'll run your query for you
20:58:08  <mjn>but pls don't download all my data :P
20:58:23  <mjn>(not in the sense of it being private, just i don't want to be a webhost for huge public datasets)
20:59:50  <mcavage>yeah, this is basically the same class of problem as "hot linking"
20:59:58  <nahamu>yeah, would be nice to be able to, e.g. add an ACL to ~~/stor/thoth allowing the thoth user to have access (the flip side of my suggestion last night about a service for accepting dumps into the thoth account)
21:00:12  <mcavage>i.e., only allow downloads when 'Referer' is X or SourceIP is X.Y.Z.0/24
21:00:12  <mcavage>etc
21:00:50  <nahamu>sounds like mjn wants to allow compute jobs but no downloads
21:01:16  <mcavage>yeah - which is a special case of that, but it's basically still "allow read WHERE ..."
21:01:27  <nahamu>"feel free to spend your own compute cycles on my cheaply hosted data, but don't run up a bill for me by downloading it"
21:01:32  <mjn>yeah, limiting it to joyent IPs would be basically the same
21:01:55  <mjn>yeah, what nahamu said
21:02:38  <mjn>there's a mild bit of dataset-pollution worry as well, like if i have a chopped up and unix-ified version of a dataset suitable for running manta-style queries on it, this isn't necessarily what someone should get if they want the canonical version
21:02:57  <mjn>they can pull it out via a manta query if they really want it, but i don't want to make it easy for people who don't know what they're doing
21:03:06  <nahamu>"oh, and pay me $X/month to maintain your entry in the ACL ;)"
21:03:30  <mcavage>mjn: example? (of the "chopped up" thing)
21:03:34  <mcavage>or "worry" specifically?
21:04:23  <mjn>i guess worry might be too strong, but some datasets come in xml, and i have some internal versions processed into tab-delimited unix style
21:05:10  <mjn>i guess it's not necessarily *bad* to let those out, but somehow setting myself up as an alternate provider of datasets (vs. the canonical source) feels like a stronger step than just providing a convenient compute path
21:05:16  <nahamu>just making it clear to people that this isn't the canonical dataset, it's a processed one, no warranty provided or implied.
21:05:35  <mjn>yeah that could be fine (but i still don't want to pay $0.12/gb for them to grab them :P)
21:06:48  <nahamu>well, sounds like the Joyent folks are hard at work on some sort of access control.
21:06:54  <mcavage>so, you want to offer the original (let's say XML) or the converted form (in your "hypothetical" business)?
21:07:04  <mjn>converted form
21:07:10  <mcavage>sorry if i'm being dense. I'm not following which you want to "offer" and which you want to "protect"
21:07:10  <nahamu>mcavage: original's might be government datasets
21:07:33  <mjn>i don't really want to protect them, that is a bit strong in retrospect, it's more that i don't want to publicly set myself up as "mjn's alternate source of bioinformatics|whatever data"
21:08:05  <nahamu>*originals
21:08:23  <nahamu>mcavage: ignore charging for a moment.
21:08:46  <mcavage>ok - so basically what's going to come (although you know, not like in the next 3 weeks...) is a rich access control mechanism (policy) which you'll be able to manage on directories/objects, and eventually users/groups such that you could just allow N parties access to individual things, and specify "conditions" (as opposed to a static ACL).
21:08:53  <mcavage>on top of that, you have "identity management".
21:09:07  <mcavage>or, will have i should say :)
21:09:14  <nahamu>if mjn as a clever and enterprising individual converts 10TB of xml grossness into beautiful TSV data ready for processing by manta jobs
21:09:17  <mcavage>such that you could create users/groups under your account that you control.
21:09:37  <mjn>yeah that should do everything i want, actually for some other much more hypotheticals currently
21:09:44  <mjn>so e.g. i could share some actually private data with other researchers by name
21:09:47  <mcavage>then it's basically up to you how you want to manage access on top of all that.
21:10:04  <mjn>"don't let people http this" just seemed like the easier case, since i didn't know acl was coming :)
21:10:18  <mjn>the no-http is more of a cost measure, actual acls would be interesting too though
21:10:34  <mcavage>well, really it solves all of it. think "ACL if ACL were SQL"
21:10:34  <mjn>because http gets billed to me, while someone processing my public data on manta gets billed to them
21:11:05  <mcavage>yep. that's if you want to require a joyent account model for "partners" to process your data. most "businesses" typically want to just encapsulate everything.
21:11:15  <mcavage>i.e., your users, they go through some "gateway" you maintain.
21:12:03  <mjn>yeah, from a business perspective that'd make sense, this is more me attempting to be nice and provide 'free' access if someone is willing to do the work, since i'm a university researcher and that kind of collegiality is semi-expected
21:12:16  <mjn>so i'd provide instructions for how they can sign up for their own account and do it at cost, if they want
21:12:35  <mcavage>right - so for "academics" and "researchers" and anybody else in that class really this is the model you'd want - require a joyent account, as long as the bill is on them you're happy.
21:12:53  <mjn>yeah
21:13:44  <mjn>i have in mind something like a three-tier approach: 1) sign up for your own joyent account and run your own queries, here's how; 2) pay me if you're a business or a funded project and i'll do it for you; 3) ask me if you can't do it yourself and you're a non-funded researcher, and i'll run it pro-bono on a case by case basis
21:14:02  <mjn>*typos
21:14:45  <mcavage>yeah - we've also talked about doing a like "data marketplace" but that's really a unicorn ;)
21:15:00  <mcavage>but i mentally note your request as falling in that bucket ;)
21:15:07  <mjn>fwiw i have no idea if this is a good 'business model', but it's a working attempt to balance service to the community and not paying out of pocket for everyone in the world :P
21:15:22  <mjn>(since i don't really get a university budget for such stuff unless it's tied to a specific funded project)
21:15:38  <mcavage>right - well, it's fine for you as a "business model" where "business" == "researchers" ;)
21:15:52  <mcavage>it's a terrible hurdle if you're trying to sell social blah blah.
21:15:56  <mjn>yeah
21:16:43  <mjn>it might even be good for y'all if i do manage to get something like that up, since a page of detailed instructions for how to e.g. query a bioinformatics database 'at cost' in parallel might get some people signing up for manta
21:16:58  <mjn>even if half give up b/c they aren't technical and don't know much about unix, some might persevere
21:17:23  <mcavage>yes, we would love it ;)
21:17:35  <mjn>i guess in the future you could do that in-house by just hosting a bunch of public datasets with "here's how you query them, sign up today!"
21:17:50  <mjn>but so far my experience is they require annoyingly fiddly processing
21:17:53  <mcavage>yeah - public datasets are on somebody's list.
21:17:58  <mjn>especially xml stuff is not always obvious how to turn into tabular data
21:18:31  <mjn>and i can't stomach running xstl queries on it or something
21:18:35  <mcavage>yeah. at least for json we have "jsontool" to make it shell friendly. if this is common enough an xml2shell thing would be nice.
21:18:44  <mjn>*xslt
21:18:50  <mcavage>blech
21:19:11  <mjn>there's xml2, which is a curious serializer: http://www.ofb.net/~egnor/xml2/
21:19:35  <mjn>it turns xml into a weird prefixed line-oriented output, like <foo><bar>baz</bar></foo> becomes '[email protected] baz' or something like that
21:20:03  <mjn>but in my tests it's like 10x slower than writing horrible special-case c code to get what you really want out of the xml file, then using regular unix tools on the result
21:20:04  <mcavage>that sounds suboptimal.
21:27:17  <trentm>I'll take kickbacks for a `xml` tool based on expat
21:30:32  * trentm quit (Quit: Leaving.)
21:32:48  <dap>mcavage, mjn: Instead of saying "don't make this accessible over HTTP", would it be reasonable to say "make this accessible over HTTP only to authenticated users, and charge *them* for it, not me"? Like a 900 number?
21:33:42  <mcavage>it's reasonable, although I'm not sure it would be that popular (fwiw, S3 has this - it's called "caller pays", but I know of very few people using it).
21:34:19  <mjn>yeah, that'd be fine with me as a provider as well, but i would like mcavage guess that few people would bother to use it
21:34:32  <dap>Yeah, fair. Equivalently, "people can make snaplinks to this, but not fetch it"
21:35:00  <mcavage>that's solved by access control.
21:35:23  <dap>Cool. Then maybe that's the way to do it. "You can fetch this over HTTP, by making your own snaplink first."
21:36:24  <mcavage>the policy based thing would basically be "allow $identity to perform $action on $resource where $arbitrary_constraints". $action = snaplink.
21:36:54  <dap>Yeah. Makes sense.
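
To make the "allow $identity to perform $action on $resource where $arbitrary_constraints" shape concrete, here is a toy evaluator in JavaScript. It is purely illustrative: access control did not exist in Manta at the time of this conversation, and the rule format, the field names, the `viaJob` flag and the example paths are all invented here, not Manta's policy language.

```javascript
'use strict';

// Hypothetical rule shape (not Manta's actual policy format):
// { identity, action, resource, where: (request) => boolean }
function isAllowed(rules, request) {
  return rules.some((rule) =>
    rule.identity === request.identity &&
    rule.action === request.action &&
    request.resource.startsWith(rule.resource) &&
    rule.where(request));
}

// Example: a colleague may snaplink the data, and may read it only from
// inside a compute job (the made-up viaJob flag), never by plain download.
const rules = [
  {
    identity: 'colleague',
    action: 'snaplink',
    resource: '/mjn/stor/datasets/',
    where: () => true
  },
  {
    identity: 'colleague',
    action: 'get',
    resource: '/mjn/stor/datasets/',
    where: (request) => request.viaJob === true
  }
];
```

The constraint is a predicate over the request rather than a static list entry, which is the difference mcavage draws between conditions and a plain ACL.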
23:15:33  * mikeal joined
23:16:13  <mikeal>using the node.js client, how do i pass the input objects to job creation?
23:20:37  <mcavage>mikeal: https://github.com/joyent/node-manta/blob/master/test/client.test.js#L223-L233
23:21:03  <mcavage>you call that API as many times as you like and then ultimately call: https://github.com/joyent/node-manta/blob/master/test/client.test.js#L260
23:22:23  <mcavage>we should probably make a JobStream or something like that for this. i.e., foo.pipe(client.createJob());
23:22:26  <mikeal>ahhhhh
23:22:52  <mikeal>would be cool if createJob returned a writable stream you could just pipe to
23:22:54  <mcavage>but for now - that's the api - each invocation of that is an API round trip, so if you have lots, chunk up your inputs into blocks of 1k or something (that's what mjob does)
23:23:02  <mcavage>yeah - i'll file a GH issue for it.
23:23:18  <mikeal>well, in that case, it's probably better to just pass this an array then
23:24:04  <mcavage>https://github.com/joyent/node-manta/issues/102
23:24:08  <mcavage>yeah- you give it an array.
23:24:43  <mcavage>just saying - if you give it one at a time it's going to hurt if you have a lot of inputs. much faster to give it a manageable list at a time.
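
A sketch of the chunking advice above, in JavaScript. It assumes a client object exposing `addJobKey(jobId, keys, callback)` and `endJob(jobId, callback)`, which is how the linked node-manta test file appears to use them; verify the signatures against the SDK before relying on this. `chunk` and `submitInputs` are names made up for the example.

```javascript
'use strict';

// Split a list into consecutive blocks of at most `size` items.
function chunk(items, size) {
  if (!Number.isInteger(size) || size < 1) {
    throw new RangeError('size must be a positive integer');
  }
  const blocks = [];
  for (let i = 0; i < items.length; i += size) {
    blocks.push(items.slice(i, i + size));
  }
  return blocks;
}

// Add inputs one block per API round trip, then close the job's input.
// Assumed client shape: addJobKey(jobId, keys, cb) and endJob(jobId, cb).
function submitInputs(client, jobId, keys, size, done) {
  const blocks = chunk(keys, size);
  (function next(i) {
    if (i === blocks.length) {
      client.endJob(jobId, done);
      return;
    }
    client.addJobKey(jobId, blocks[i], (err) => {
      if (err) {
        done(err);
        return;
      }
      next(i + 1);
    });
  })(0);
}
```

With a block size of 1000, as mcavage suggests, ten thousand inputs cost ten round trips instead of ten thousand.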
23:25:10  <mikeal>is createJob('ls', {id:'myname'}) how i name the job explicitly?
23:26:21  <mcavage>no, you'd want it to look like what's in the comments: https://github.com/joyent/node-manta/blob/master/lib/client.js#L1285-L1317
23:26:32  <mcavage>that createJob('ls') is just sugary short-hand
23:26:49  <mcavage>side point: apologies the SDK docs aren't online. I'll get on that again.
23:27:52  <mcavage>so you'd want client.createJob({name: 'foo', phases: [ { exec: 'ls' }] }, function (err, job) { ... });
23:37:31  <mcavage>mikeal: ok, the node docs are now formatted: https://apidocs.joyent.com/manta/nodesdk.html
23:42:00  <mikeal>nice
23:42:19  * dap quit (Quit: Leaving.)
23:45:50  * dap joined
23:57:21  * AvianFlu joined
23:58:08  <mikeal>i tried {name and it still gives me back a random uuid as the id