16:40:53  <cxe1>if the LB is down and svc:/manta/application/registrar:default is offline, what is the right way to bring it back?
16:46:43  <nahamu>Are there some quick and simple sanity checks after I run deploy_coal_manta.sh to verify that Manta is generally happy?
16:51:58  <cxe1>yes
16:52:26  <cxe1>you would need to deploy madtom, which is not deployed by default on CoaL
16:57:02  <cxe1>add madam image to the config-coal.jason and run manta-dam update madtom
17:02:53  <cxe1>rather: add the madtom image to the config-coal.json and run manta-adm update madtom
17:11:01  <nahamu>where does the config-coal.json live?
17:14:09  <nahamu>looks like in /var/tmp/networking there's a coal.json that looks relevant.
17:21:21  <cxe1>/var/tmp
17:21:58  <cxe1>the coal.json under /var/tmp/networking is your network config
17:24:44  <nahamu>I don't see a file with that name there.
17:25:25  <cxe1>are you in manta0 zone?
17:26:06  <nahamu>I wasn't, but now I am. I see a lab-config.json in there.
17:26:22  <cxe1>there you go
17:26:46  <nahamu>and I see an empty section for madtom.
17:28:15  <cxe1>add the madtom image UUID, following the format of the other sections
17:29:07  <nahamu>do I need to add updates.joyent.com as a source and import it?
17:29:18  <nahamu>(where do I get the UUID to use?)
17:29:43  <nahamu>nevermind, I see it in imgadm avail
17:30:27  <cxe1>sdc-imgadm list | grep madtom
17:30:33  <jayschmidt><nahamu> - you can also run "manta-adm genconfig lab" and it will give you the UUIDs for the bits that "manta-adm genconfig coal" doesn't deploy (madtom, marlin-dash, and, I think, medusa).
17:32:08  <nahamu>how do I point mantadm to know to read from that /var/tmp/lab-config.json?
17:32:13  <nahamu>*manta-adm
17:32:40  <jayschmidt>you pass the file to use along to manta-adm update
17:33:08  <jayschmidt>Then it will look at the running config, compare to what you're asking it to do, and say "right, I need to deploy X or undeploy Y"
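
The comparison jayschmidt describes (running config versus requested config) can be sketched roughly as below. This is not manta-adm's actual code: `plan_update` and the flat service-name-to-count dictionaries are hypothetical simplifications made up for illustration, and the real config files carry more structure than this.

```python
def plan_update(running, desired):
    """Compare two {service: instance_count} maps (a hypothetical, flattened
    stand-in for a real config) and return (to_deploy, to_undeploy)."""
    to_deploy = {}
    to_undeploy = {}
    for service in sorted(set(running) | set(desired)):
        have = running.get(service, 0)
        want = desired.get(service, 0)
        if want > have:
            to_deploy[service] = want - have
        elif have > want:
            to_undeploy[service] = have - want
    return to_deploy, to_undeploy


# Illustrative data only: a deployment without madtom, and a requested
# config that adds one madtom instance.
running = {"nameservice": 1, "authcache": 1, "ops": 1}
desired = {"nameservice": 1, "authcache": 1, "ops": 1, "madtom": 1}

deploy, undeploy = plan_update(running, desired)
print("deploy:", deploy)
print("undeploy:", undeploy)
```

In this sketch only the difference is acted on, so services already present at the requested count are left alone.
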
17:33:12  <nahamu>cool, that seems to be working
17:35:00  <nahamu>okay, so now I just need to hit madtom on port 80 to see that everything looks healthy?
17:35:07  <cxe1>yes
17:35:38  <cxe1>if you are on CoaL just use the port. If you are on something else you might need to build an SSH tunnel
17:36:06  <cxe1>something like this ssh -L 8080: -f [email protected] -N
17:38:04  <jayschmidt>In a perfect world, madtom will be nice and green.
17:38:19  <nahamu>the madtom zone doesn't seem totally happy...
17:38:47  <nahamu>mdata:execute seems to have failed and manta/application/registrar is offline.
17:43:51  <jayschmidt>I would check the logs from mdata:execute, try restarting it to see if that clears it, then try a reboot of the madtom zone. Worst case I would try and undeploy and redeploy the madtom zone.
17:45:08  <cxe1>you have to wait for the name service to be deployed and started
17:45:25  <cxe1>if you click yes prematurely bad things will happen
17:45:41  <nahamu>click yes to what?
17:46:28  <nahamu>right, it's erroring out with some sort of zookeeper is not running message.
17:46:33  <cxe1>oh. Looks like deploy-manta-coal skips the portion where you click "yes/No" to continue
17:46:48  <cxe1>I filed a bug but could not reproduce it
17:47:05  <cxe1>run manta-factoryreset and follow the screen closely
17:47:21  <nahamu>in the manta zone or the GZ?
17:52:14  <nahamu>is there a similar sanity check I can do without having to deploy madtom?
17:53:32  <jayschmidt>What I normally do is manta-login to the ops zone once everything has come up, then I try and put a file, get a file, and run a job - that way you hit the metadata tier, the storage tier, and the compute tier.
17:54:11  <jayschmidt>In this case your problem may go beyond madtom - it's fairly lightweight (I think it's a 256MB vm), so your issue may be beyond that.
18:38:27  <nahamu>okay, I destroyed the Manta install and redid it and then added madtom and everything is green, so that's good.
18:38:46  <nahamu>jayschmidt: is "manta0" the "ops zone" you were talking about?
18:39:47  <jayschmidt>There should be a straight "ops" zone - if you do "manta-login ops" from the GZ of the HN it should find it.
18:41:56  <nahamu>should mlogin work too?
18:42:05  <nahamu>(from the ops zone)
18:46:19  <jayschmidt>there is some weirdness w/ the way that medusa has to be setup.
18:46:27  <jayschmidt>To be completely honest, I've only set it up once.
18:46:40  <nahamu>I'll skip that for now then.
18:47:05  <nahamu>but I got put, get, and mjob create to work, and madtom shows everything green, so I think I'm good.
18:47:09  <jayschmidt>whoo!
18:47:12  <jayschmidt>Nice.
18:47:16  <nahamu>Now I can test a SmartOS patch I want to upstream.
18:47:23  <jayschmidt>Even better.
18:47:48  <nahamu>I'm pretty sure if I can still get Manta set up and happy it will be considered to have been tested enough to be merged.
20:45:27  <nahamu>https://github.com/joyent/sdc-manta/pull/2
22:43:17  <cxe1>I am getting this a lot: 1 dependent service is not running: "svc:/manta/application/registrar:default"
22:43:22  <cxe1>How do I fix it?
22:43:50  <cxe1>it seems only reprovisioning the service solves it, but I want to be able to fix it inside the zone
22:44:21  <rmustacc>Well, the first step is to understand why the dependent service isn't running.
22:44:32  <rmustacc>Is it in maintenance, is it offline, is it something else?
22:44:49  <cxe1>svc:/smartdc/mdata:execute is in maintenance
22:45:25  <rmustacc>Okay, why did it enter maintenance?
22:45:25  <cxe1>everything else is running, but I moved authcache from one CN to another
22:45:35  <cxe1>svc:/manta/application/registrar:default is offline
22:45:59  <cxe1>1 dependent service is not running
22:46:06  <cxe1>svc:/manta/application/registrar:default
22:46:30  <cxe1>so always one in maintenance and one offline
22:47:27  <cxe1>reason "is not running because a method failed." for svc:/manta/application/registrar:default
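
The first triage step rmustacc asks for (which services are in maintenance, which are offline) can be sketched as a small parser over `svcs`-style output. Everything here is an assumption for illustration: `services_by_state` is a hypothetical helper, and the sample text is hand-written to match the states reported above, not captured from a real zone.

```python
def services_by_state(svcs_output):
    """Group FMRIs by state from text in the STATE / STIME / FMRI column
    layout, skipping the header and anything already online."""
    states = {}
    for line in svcs_output.strip().splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] == "STATE":
            continue
        state, fmri = fields[0], fields[-1]
        if state != "online":
            states.setdefault(state, []).append(fmri)
    return states


# Hand-written sample (times are made up); the two unhealthy services are
# the ones named in the conversation.
SAMPLE = """\
STATE          STIME    FMRI
online         22:40:01 svc:/system/filesystem/local:default
maintenance    22:41:12 svc:/smartdc/mdata:execute
offline        22:41:12 svc:/manta/application/registrar:default
"""

for state, fmris in services_by_state(SAMPLE).items():
    for fmri in fmris:
        print(state, fmri)
```

Read this way, the service in maintenance is the one to investigate first, since the offline one is reported as waiting on a dependency.
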
22:52:06  <cxe1><rmustacc> seeing the error "ZooKeeper is not running", but I see it is running just fine
22:52:55  <cxe1>does it have to run on the same host as authcache
22:53:01  <cxe1>?
