TREX: Logbook Entries

TREX: base Messages: 7 Entries

Entry  Date             Title                                                                       Site  Author    #Graphics
245    Wed 26-Apr-2006  Aster outage when docking station hardware suddenly and permanently failed  base  poulos
189    Fri 14-Apr-2006  Hacker Attack - Changed ssh daemon allows                                         militzer
138    Sun 02-Apr-2006  how to restart covars                                                       base  horst
93     Wed 22-Mar-2006  Network disconnected                                                        base  semmer
92     Wed 22-Mar-2006  CRASH!                                                                      base  semmer
35     Fri 10-Mar-2006  dsm crash                                                                   base  oncley
10     Mon 27-Feb-2006  monitoring tools                                                            base  oncley


245: base, Site base, Wed 26-Apr-2006 11:40:53 PDT, Aster outage when docking station hardware suddenly and permanently failed
At 0615 local time it was observed that Aster at the Base trailer was off. The author had never seen Aster shut down during the entire experiment and had not shut it down when departing the trailer at approximately 2000 25 Apr. Approximately 2 minutes later the sounding team, Bob Street (Stanford) and Serena Chew (DRI), showed up for the morning sounding during the current IOP of T-REX. They noted that lightning had occurred overnight, near MAPR, after midnight local.

An unsuccessful attempt was made to turn on the system at ~0620 after checking power cords and connectors (noting that the power was on to all drives, the monitor, the modems, the Terabeam, the lights, the Leeds sounding computer and various surge protectors). Thus, it appeared that either Aster or the computer port had experienced a catastrophic failure of some kind. Further investigation revealed that
1) the Aster machine would not turn on with battery power (the battery had drained after power to it had failed),
2) the computer Docking Station would not turn on when powered from any plug, including known functional power,
3) there was no access to a fuse on the Docking Station through any panels etc.,
4) the original 90W power cable for the Aster (Dell Latitude D800) was not in the trailer (two Poulos searches, one Chew search),
5) the power cord for Poulos' Dell Latitude D410 would apply power to Aster, but was of 65W and therefore subject to lesser performance,
6) internet access was down in the trailer,
7) Agee reported from online in Bishop that the data plots had been down since before midnight (~2300 local), which, when combined with the fact that no other electronic equipment was down, eliminated to a significant degree the 'lightning' theory,
8) #5 allowed some power to the battery and Aster could be functional, leading to the conclusion that the Docking Station had indeed had a catastrophic failure.

  After lengthy troubleshooting, and after Semmer looked up the viability of running Aster on 65W power, we decided to bring up Aster on 65W power. The only peripherals plugged in were the 12.0 and 14.0 connections to limit power usage.

  Fortunately, this worked after a 2nd hard boot. The GPS time unit was subsequently plugged in as a crucial element of timekeeping after consultation with Gordon. This created no power or performance issues on the sub-optimally powered Aster.

  While no data had been written to Aster over the wireless, pings, file listings and data_stats of the 3 towers indicated that indeed the local data storage was complete overnight during the Aster outage. No data had been lost and soon thereafter files were filling on Aster with on-line data plots updating.

  Subsequently, it was decided after discussions with Gordon/Steve S. to one-by-one attach various peripherals to Aster in its weak power state in hopes of retaining near normal functionality. An attempt was first made to enable the use of the 19" computer screen rather than the Aster laptop screen - this was unsuccessful - with electronic gibberish being the onscreen result. We thus are using Aster without the big screen until Golubieski arrives with the 90W power supply from Boulder (none were available at Schat.net in Bishop when Agee arrived first thing this morning and it would be 4-5 days to ship).

  Agee also used a multi-meter to assess the circuits on the Docking Station power supply. He found 'beeps' (circuit completion) when the meter was attached to the cord itself plugged in, but not when attached to the male pins at the back of the Docking Station. Our preliminary conclusion is that the internal power supply has failed or a fuse has blown within the Docking Station. Once that occurred, apparently randomly, battery power allowed Aster to function temporarily while the battery ran down to zilch. Thus the state at 0615 26 Apr.

  Gordon believes that since exthd7 and 9 are powered separately, their draw from the USB will be limited when they are plugged in. Steve S. has suggested doing this in half-hour stages to ensure Aster functionality at each step.

  We also agreed that, since the data acquisition is quite solid onsite, an IOP is underway and the Central tower wireless is still potentially shaky after yesterday's tests, off-loading local storage to fill in the online data plot gaps is not advisable at this time.



189: base, Site , Fri 14-Apr-2006 12:27:23 PDT, Hacker Attack - Changed ssh daemon allows
14Apr06,
Check /var/log/messages 

I just happened to be here, and saw continuous hack attacks every 5 sec of the following type...

Apr 14 11:51:07 localhost sshd(pam_unix)[3286]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=61-219-243-114.hinet-ip.hinet.net  user=sshd
Apr 14 11:51:12 localhost sshd(pam_unix)[3288]: check pass; user unknown
Apr 14 11:51:12 localhost sshd(pam_unix)[3288]: authentication failure; logname= uid=0 euid=0 tty=ssh ruser= rhost=61-219-243-114.hinet-ip.hinet.net

Gordon (Mammothing) suggested shutting down the ssh daemon:
/etc/init.d/sshd stop
Reading man sshd_config didn't reveal (to me at least) a specific 'allow-hosts' option, so instead I added a line to the /etc/ssh/sshd_config file:
	#Editing 14Apr06, jwm, to restrict logins due to observed hack-attack.
	#hopefully this will do the trick without bumping boulder, and locally on site.
	AllowUsers aster@192.168.* aster@128.117.80*

and saved an original copy in: sshd_config.save
and finally:

/etc/init.d/sshd start
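
To gauge how persistent an attack like this is, the failures in the log can be tallied per remote host. The following is only a minimal sketch, assuming the log lines carry an rhost= field as in the excerpt above; the function name count_failures_by_rhost is hypothetical and not part of the site setup.

```shell
#!/bin/sh
# Tally sshd authentication failures per remote host, most frequent first.
# Assumption: failure lines look like the excerpt above and include "rhost=<host>".
count_failures_by_rhost() {
    logfile="${1:-/var/log/messages}"
    grep 'sshd.*authentication failure' "$logfile" |
        sed -n 's/.*rhost=\([^ ]*\).*/\1/p' |
        sort | uniq -c | sort -rn
}
```

Called as count_failures_by_rhost /var/log/messages, it prints one line per host with the number of failed attempts in front.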

138: base, Site base, Sun 02-Apr-2006 11:08:32 PDT, how to restart covars
To restart covar/netcdf process:

[aster@aster ~]$ cd ~aster
[aster@aster ~]$ ./runstats_sock
18612: old priority 0, new priority 19
[aster@aster ~]$ ps -ef |grep stats
aster    18615     1  3 10:49 ?        00:00:00 statsproc sock:localhost:30000

To catch up covar/netcdf, edit runstats.sh in ~aster to list the days to rerun, e.g.

statsproc isff_20060402_*.dat

Then:

[aster@aster ~]$ batch
at> ./runstats.sh
at> ^D
job 18 at 2006-04-02 10:52
[aster@aster ~]$ at -l
18      2006-04-02 10:52 = aster
[aster@aster ~]$ ps -ef | grep stats
aster    18615     1  2 10:49 ?        00:00:07 statsproc sock:localhost:30000
aster    18642 18641  0 10:52 ?        00:00:00 /bin/sh ./runstats.sh
aster    18644 18642 93 10:52 ?        00:02:23 statsproc isff_20060401_000000.dat isff_20060401_040000.dat isff_20060401_080000.dat isff_20060401_120000.dat isff_20060401_160000.dat isff_20060401_221636.dat isff_20060401_224518.dat
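
Instead of editing the file pattern by hand for each day, the file list could be built from a list of dates. This is a sketch only: files_for_days is a hypothetical helper, and it assumes the isff_YYYYMMDD_HHMMSS.dat naming seen in the process listing above and file names without spaces.

```shell
#!/bin/sh
# Hypothetical helper: print the isff_<day>_*.dat files for each day given.
# Usage: files_for_days <datadir> <YYYYMMDD> [<YYYYMMDD> ...]
files_for_days() {
    datadir="$1"
    shift
    for day in "$@"; do
        for f in "$datadir"/isff_"$day"_*.dat; do
            # An unmatched glob stays literal; skip it when a day has no files.
            if [ -e "$f" ]; then
                printf '%s\n' "$f"
            fi
        done
    done
    return 0
}
```

Inside runstats.sh this might then be used as statsproc $(files_for_days . 20060401 20060402), relying on statsproc accepting several files on one command line as the listing above shows.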




93: base, Site base, Wed 22-Mar-2006 20:36:58 PST, Network disconnected
Charlie called Cebridge and found out that they had disabled our modem.
After a few minutes they activated it again. It may get disabled again
so we will leave it on the school link until tomorrow.

92: base, Site base, Wed 22-Mar-2006 19:35:16 PST, CRASH!
Around 7:00pm we had a major crash. The first thing to go down was the
network interface. Charlie believes it was a Cebridge problem. I then
noticed that the aster system was not working correctly. A hard reboot of
aster had to be done. It refused to allow su login. It would not come
up properly, hung at swap disk space. Recycled power again and it came up ok.
Charlie is still working on the network link.


35: base, Site base, Fri 10-Mar-2006 08:35:26 PST, dsm crash
The dsm process crashed again last night at 1720 local (actually while we were 
here, but we didn't catch it :( ).  This probably was associated with a 
TPOP crash, of which we had lots yesterday.  All stations stored data locally.

10: base, Site base, Mon 27-Feb-2006 21:34:47 PST, monitoring tools
data_stats sock:localhost [gives all nids channels]

Alico AP24 winbox (on Toughbook) [gives West's status]

TeraBeam navigator/configure/click on 192...address/"configure remote"/
"AP Associated ..." (on Toughbook) [gives Etherant/AP24 internal status]

More Etherant status on their own WWW sites.  (see /etc/hosts for addresses)
[set up as pull-down bookmarks from mozilla on aster base]
passwords should be saved by mozilla.