Bug 135158 - Local login session fails
Summary: Local login session fails
Status: RESOLVED FIXED
Alias: None
Product: kdm
Classification: Miscellaneous
Component: general
Version: unspecified
Platform: Compiled Sources Solaris
Importance: NOR crash
Target Milestone: ---
Assignee: kdm bugs tracker
URL:
Keywords:
Depends on:
Blocks:
 
Reported: 2006-10-05 17:23 UTC by Jens Hatlak
Modified: 2008-05-19 17:30 UTC
CC List: 0 users

See Also:
Latest Commit:
Version Fixed In:


Description Jens Hatlak 2006-10-05 17:23:46 UTC
Version:            (using KDE 3.5.4)
Installed from:    Compiled From Sources
Compiler:          gcc 4.0.3 
OS:                Solaris

I'm seeing the following on SPARC/Solaris 8: With KDM from KDE 3.5.4, remote login (e.g. Xnest -query) works for all users. Local login sessions fail, though: users can log in, but the session crashes, even the failsafe session. With XDM, both remote and local logins succeed. Previously I had KDE 3.4 installed, and its KDM worked, too.

Maybe I should note that we are using the automounter for our home directories.

I attached truss -f (strace equivalent) to a running KDM process and found the following:

3137:   setuid(8739)                                    = 0
3137:   stat("/home/default/.Xauthority-c", 0xFFBEDB18) Err#2 ENOENT
3137:   open("/home/default/.Xauthority-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = 6
3137:   close(6)                                        = 0
3137:   link("/home/default/.Xauthority-c", "/home/default/.Xauthority-l") = 0
3137:   creat64("/home/default/.Xauthority-n", 0600)    = 6
3137:   open64("/home/default/.Xauthority", O_RDONLY)   Err#2 ENOENT
3137:   so_socket(2, 2, 0, "", 1)                       = 7
3137:   ioctl(7, 0xC00C6982, 0xFFBEE3F4)                = 0
3137:   ioctl(7, 0xC0106978, 0xFFBEE3E4)                = 0
3137:       Incurred fault #5, FLTACCESS  %pc = 0x00021604
3137:         siginfo: SIGBUS BUS_ADRALN addr=0xFFBED707
3137:       Received signal #10, SIGBUS [default]
3137:         siginfo: SIGBUS BUS_ADRALN addr=0xFFBED707

8739 is the user id of the test user 'default'. As you can see, switching to user 'default' succeeds, but a crash occurs afterwards.
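
Editorial note on the fault in the trace above: BUS_ADRALN means the CPU rejected a misaligned memory access, and the faulting address 0xFFBED707 is odd, so it cannot be a valid address for any multi-byte load on SPARC. The following is only a minimal sketch of that failure class and of the usual portable way around it. It is not taken from the kdm sources, and the helper names is_aligned() and load_u32_unaligned() are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Returns 1 if p is a multiple of alignment, else 0.
 * (Hypothetical helper, for illustration only.) */
static int is_aligned(const void *p, size_t alignment)
{
    assert(alignment != 0);
    return ((uintptr_t)p % alignment) == 0;
}

/* Reads a 32-bit value from an arbitrary, possibly odd, address.
 * memcpy() makes no alignment assumption, so this is safe on
 * strict-alignment CPUs such as SPARC, where a direct
 * *(const uint32_t *)p load from a misaligned address raises
 * SIGBUS with code BUS_ADRALN. */
static uint32_t load_u32_unaligned(const void *p)
{
    uint32_t v;
    memcpy(&v, p, sizeof v);
    return v;
}
```

On x86 the direct cast usually appears to work, which is one reason why this kind of bug tends to show up only on platforms like SPARC.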

In comparison, an extract from a successful remote login session looks like this:

5713:   setuid(8739)                                    = 0
5713:   stat("/home/default/.Xauthority-c", 0xFFBEDA70) Err#2 ENOENT
5713:   open("/home/default/.Xauthority-c", O_WRONLY|O_CREAT|O_EXCL, 0600) = 3
5713:   close(3)                                        = 0
5713:   link("/home/default/.Xauthority-c", "/home/default/.Xauthority-l") = 0
5713:   creat64("/home/default/.Xauthority-n", 0600)    = 3
5713:   open64("/home/default/.Xauthority", O_RDONLY)   = 5
5713:   fstat64(3, 0xFFBEDF80)                          = 0
5713:   ioctl(3, TCGETA, 0xFFBEDF0C)                    Err#25 ENOTTY
5713:   fstat64(5, 0xFFBEE2C8)                          = 0
5713:   chmod("/home/default/.Xauthority-n", 0600)      = 0
5713:   fstat64(5, 0xFFBEE028)                          = 0
5713:   ioctl(5, TCGETA, 0xFFBEDFB4)                    Err#25 ENOTTY
5713:   read(5, 0x00071AF4, 8192)                       = 0
5713:   llseek(5, 0, SEEK_CUR)                          = 0
5713:   close(5)                                        = 0
5713:   write(3, "\0\0\00482 SA085\002 2 2".., 50)      = 50
5713:   close(3)                                        = 0
5713:   unlink("/home/default/.Xauthority")             = 0
5713:   link("/home/default/.Xauthority-n", "/home/default/.Xauthority") = 0
5713:   unlink("/home/default/.Xauthority-n")           = 0
5713:   unlink("/home/default/.Xauthority-c")           = 0
5713:   unlink("/home/default/.Xauthority-l")           = 0
5713:   chdir("/home/default")                          = 0
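
Editorial note: the system calls in the successful trace follow a lock-and-replace pattern on ~/.Xauthority: create "-c" exclusively, hard-link it to "-l" to take the lock, write the new contents to "-n", link "-n" into place, then remove the lock files. The sketch below mirrors that sequence under simplifying assumptions (no retries, no stale-lock timeout, no merging of old entries). The function names lock_auth(), unlock_auth() and replace_auth() are hypothetical and this is not the code kdm or libXau actually uses.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Hypothetical helper: build "<path><suffix>" into buf; 0 on success. */
static int suffixed(char *buf, size_t size, const char *path, const char *suffix)
{
    int n = snprintf(buf, size, "%s%s", path, suffix);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}

/* Take the lock: create "<path>-c" exclusively, then hard-link it to
 * "<path>-l". Either step fails if another process holds the lock. */
static int lock_auth(const char *path)
{
    char c[1024], l[1024];
    int fd;

    assert(path != NULL);
    if (suffixed(c, sizeof c, path, "-c") || suffixed(l, sizeof l, path, "-l"))
        return -1;
    fd = open(c, O_WRONLY | O_CREAT | O_EXCL, 0600);
    if (fd < 0)
        return -1;
    close(fd);
    if (link(c, l) != 0) {
        unlink(c);              /* we created it, so we may remove it */
        return -1;
    }
    return 0;
}

/* Release the lock by removing both lock files. */
static void unlock_auth(const char *path)
{
    char c[1024], l[1024];

    if (suffixed(c, sizeof c, path, "-c") == 0)
        unlink(c);
    if (suffixed(l, sizeof l, path, "-l") == 0)
        unlink(l);
}

/* Write the new contents to "<path>-n", then move it into place the
 * same way the trace does: unlink old file, link new one, unlink "-n". */
static int replace_auth(const char *path, const void *data, size_t len)
{
    char n[1024];
    ssize_t written;
    int fd;

    if (suffixed(n, sizeof n, path, "-n"))
        return -1;
    fd = open(n, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    written = write(fd, data, len);
    if (close(fd) != 0 || written != (ssize_t)len) {
        unlink(n);              /* a short write is treated as failure */
        return -1;
    }
    unlink(path);               /* ENOENT on a first login is fine */
    if (link(n, path) != 0) {
        unlink(n);
        return -1;
    }
    unlink(n);
    return 0;
}
```

A caller would use it as lock_auth(path), replace_auth(path, data, len), unlock_auth(path). In the failing local trace the process dies after the "-n" file is created, before any of the replace steps run.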
Comment 1 Oswald Buddenhagen 2006-10-06 07:57:22 UTC
the two traces aren't exactly comparable, because in the first one ~/.Xauthority does not exist yet, while in the second one it does.

it would be very helpful if you could attach a debugger to the session subdaemon (put it in trace child on fork() mode), so we can get a useful backtrace.
Comment 2 Jens Hatlak 2006-10-09 15:42:26 UTC
I don't think it matters whether that file exists or not. The two traces are not comparable because one is from a remote login while the other is from a local login. The error is somewhere else.

The debugger was not very helpful since I did not have a debug-enabled kdebase/kdm. Here is what I got (gdb, set follow-fork-mode child, attach <pid>):

(gdb) bt
#0  0xff09da48 in _poll () from /usr/lib/libc.so.1
#1  0xff04d618 in select () from /usr/lib/libc.so.1
#2  0x0001daf4 in main ()

That's all. After some thinking I remembered that I took a note about how to enable full debug output for kdm (-debug 0x101). This showed something interesting:

Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] SetUserAuthorization
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] XauLockAuth /home/default/.Xauthority
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] lock is 0
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] opens succeeded /home/default/.Xauthority /home/default/.Xauthority-n
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] 1 authorization protocols for :0
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] writeLocalAuth: :0 MIT-MAGIC-COOKIE-1
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] setAuthNumber :0
Oct  9 12:23:12 client100 kdm: :0[6568]: [ID 197553 daemon.debug] setAuthNumber: 0

This is the last line that appears for process 6568. As I found out, this output comes from setAuthNumber() invoked in writeLocalAuth() in backend/auth.c. So I guess the crash comes from something in either writeLocalAuth() or one of the functions invoked in there, namely DefineSelf() and DefineLocal().

I hope this is enough information for you to start working out what might be happening here.

I also found two workarounds. Either of these allows me to log in locally.
a) set AuthDir=/tmp in kdmrc (default: /var/run/xauth)
b) set Authorize=false in kdmrc
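
Editorial note: as a kdmrc fragment, the two workarounds might look like the sketch below. The section names are an assumption based on the usual KDM 3.5 layout (AuthDir under [General], Authorize under a per-display [X-*-Core] section) and are not stated in this report, so check them against your own kdmrc. Only one of the two settings is needed.

```ini
# Workaround (a): keep the server authorization files in /tmp
# instead of the default /var/run/xauth
[General]
AuthDir=/tmp

# Workaround (b): do not generate X authorization for the display
[X-*-Core]
Authorize=false
```

Note that (b) turns off X authorization for the affected displays, so (a) is the less invasive choice.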
Comment 3 Jens Hatlak 2006-10-11 15:52:10 UTC
I rebuilt KDM with debug enabled. Surprisingly, I cannot reproduce the crash with the KDM binaries from that build. I'm currently rebuilding with debug disabled, just to be sure.

Running the debug binaries with -debug 0x101, I get the same output as in my last comment, but unlike there, the process does not die. Instead, the following is printed after "setAuthNumber: 0":

Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] ConvertAddr returning 0 for family 2
Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] DefineSelf: write network address, length 4
Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] writeAddr: writing and saving an entry
Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] writeAuth: doWrite = 1
Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] family: 0

and some more lines containing addresses that I'm not posting since I guess they should not be made public. After these lines, the session initialization begins:

Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] new authorization moved into place
Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] done SetUserAuthorization
Oct 11 15:35:04 client100 kdm[12120]: [ID 197553 daemon.debug] select returns 1
Oct 11 15:35:04 client100 kdm: :0[12131]: [ID 197553 daemon.debug] executing session "/usr/local/lib/kde/share/config/kdm/Xsession" "failsafe"

and I have a working failsafe session.
Comment 4 Jens Hatlak 2006-11-06 18:06:47 UTC
We finally found the source of the problem in Solaris PAM, which requires PAM_RHOST to be set (pam_unix_session). See OpenSolaris bug ID 4777938 or
http://mail.gnome.org/archives/gdm-list/2004-February/msg00016.html. Linux PAM seems to accept NULL for that value - see http://www.kernel.org/pub/linux/libs/pam/Linux-PAM-html/mwg-expected-by-module-item.html.

Workaround: add the following to /etc/pam.conf:

kdm     session required                pam_sample.so.1

BTW: The same applies to GDM 2.6.0.9.

The real solution would be to set RHOST to the null string for Solaris <= 9.
Comment 5 Jens Hatlak 2006-11-07 23:38:29 UTC
FWIW, current GDM versions seem to be affected, too:
http://cvs.gnome.org/viewcvs/gdm2/daemon/verify-pam.c?view=markup
("Only set RHOST if host is remote")
Comment 6 Oswald Buddenhagen 2007-04-11 16:34:23 UTC
SVN commit 652591 by ossi:

work around solaris < 10 PAM breakage. will backport immediately.
BUG: 135158


 M  +4 -0      client.c  


--- branches/KDE/3.5/kdebase/kdm/backend/client.c #652590:652591
@@ -316,6 +316,10 @@
 		if (pretc != PAM_SUCCESS)
 			goto pam_bail;
 	}
+# ifdef __sun__ /* Only Solaris <= 9, but checking it does not seem worth it. */
+	else if (pam_set_item( pamh, PAM_RHOST, 0 ) != PAM_SUCCESS)
+		goto pam_bail;
+# endif
 # ifdef PAM_FAIL_DELAY
 	pam_set_item( pamh, PAM_FAIL_DELAY, (void *)fail_delay );
 # endif