Thursday, March 30, 2006

disable zlogin

We currently have scripts running during the provisioning of
zones to populate SOE packages and perform preliminary
configuration. This work must be completed before access is
granted to anyone, including the global administrator.

To prevent zlogin access to a local zone, add the following
lines to the zone's /etc/pam.conf just above
the "other auth" lines:

#
# disable zlogin
zlogin auth required pam_deny.so.1

Tuesday, March 28, 2006

Solaris motd issue

(1) Here is how the login(1) routine works:
/usr/sbin/quota (check quota)
/bin/cat -s /etc/motd (print motd)
/bin/mail -E (check mail)

(2) Here is how mibiisa(1M) works as an SNMP agent utility:
motd is part of the sunsystem group for general
system information reporting; it reports the first line
of /etc/motd (string[255]).

(3) For the JES link: I have JESQ4 on my system and it does
not show the link. Have you checked which component
creates the link?

(4) The few lines of D code below may help you discover the issue.


Performance analysis for the algorithm counts the cost of
the steps of the random-access machine that is modeled for
the instrumentation.

Consequently, tracing at the higher-level syscall
instrumentation point, symlink(char *target, char *linkname),
seems friendly for the implementation of the code, but it
does not give any performance advantage. Therefore, I would
keep the routine as close to the I/O layer as possible, to
minimize the cost of delegation through the layered kernel
architecture.

I would suggest adding a directive as a conditional rule
pointing to the path of the motd, in order to filter the
I/O.

symlink(2) does the link and rename only. The AI and
algorithm calculation does make sense. The implementation
of my AI and algorithm is enhanced in the code below.

Please let me know if it works on your system.



#!/usr/sbin/dtrace -s

#pragma D option quiet

dtrace:::BEGIN
{
	printf("%15s %40s\n", "Executable", "LinkFileName");
}

/* Please note the input here is the link file name, not the path. */
fbt::fop_symlink:entry
/stringof(args[1]) == $$1/
{
	printf("%15s %40s\n", execname, stringof(args[1]));
}

In addition, you can separate the reads from the writes to further
narrow down the report.

Here is a script that will print the time, the name of the executable,
and ptree output when anyone tries to link /etc/motd:

#!/usr/sbin/dtrace -wqs

syscall::symlink:entry
/basename(copyinstr(arg1)) == "motd"/
{
	printf("Caught the culprit\n");
	printf("%20s\t %-20Y\n", "Time", walltimestamp);
	printf("%20s\t %-10d\n", "Process id", pid);
	printf("%20s\t %-20s\n", "Name of Executable", execname);
	stop();
	system("ptree %d", pid);
	system("prun %d", pid);
}

Also, if they want to use DTrace to automatically prevent the process
from creating the link, they can use the script below. This causes
any link to /etc/motd to become a link to /tmp/motd instead, and then
removes the /tmp/motd file.

#!/usr/sbin/dtrace -wqs

syscall::symlink:entry
/copyinstr(arg1) == "/etc/motd"/
{
	printf("Caught the culprit\n");
	printf("%20s\t %-20Y\n", "Time", walltimestamp);
	printf("%20s\t %-10d\n", "Process id", pid);
	printf("%20s\t %-20s\n", "Name of Executable", execname);
	/*
	 * Overwrite the first 9 bytes of the user's target string;
	 * the existing NUL after "/etc/motd" terminates "/tmp/motd".
	 */
	copyoutstr("/tmp/motd", arg1, 9);
	stop();
	system("ptree %d", pid);
	system("prun %d", pid);
	system("rm /tmp/motd");
}

Monday, March 27, 2006

Second hit on OS kernel architecture model design

As computer science illustrates: a goal-based model agent
performs the routines of perception, rule matching and goal
mapping in order to approximate the actions for dealing
with the subset of unobserved conditions. Consequently,
what will be the input of the change? The kernel,
specifically the core kernel modules, seems relatively
stable. What else? Security patches? Certified third-party
drivers? System libraries? User-land applications? Is this
the time to review the challenges of the traditional open
system kernel architecture model design in dealing with
complexity? It is the common control, planning and game
algorithms and models, along with economic models, that
inspire CS scientists to review the challenges for OS
vendors. Microsoft is not alone. It will be valuable to
investigate the asymptotic notation for both best cases
and worst cases.

On the other hand, what will be the business
intelligence that protects the OS vendors' market share,
along with the user-land applications, for the OS players?
Do user-land applications continue to lock in end users?
What will be the economic and innovative delivery model for
both OS vendors and application providers? What can OS
vendors really get from open-source or open-services
environments?

Many, many questions and thoughts!


http://www.nytimes.com/2006/03/27/technology/27soft.html?hp&ex=1143522000&en=1c725e1c50ae8d6c&ei=5094&partner=homepage

Tuesday, March 21, 2006

System Event Handling for Non-Global Zones via the GPEC event queue channel

System events are not available in a non-global zone. However, the
library calls in sysevent(3SYSEVENT) are a working solution:

sysevent_bind_handle(3SYSEVENT) – bind or unbind subscriber handle
sysevent_free(3SYSEVENT) – free memory for sysevent handle
sysevent_get_attr_list(3SYSEVENT) – get attribute list pointer
sysevent_get_class_name(3SYSEVENT) – get class name, subclass name, ID or buffer size of event
sysevent_get_pid(3SYSEVENT) – get vendor name, publisher name or processor ID of event
sysevent_get_pub_name(3SYSEVENT) – get vendor name, publisher name or processor ID of event
sysevent_get_seq(3SYSEVENT) – get class name, subclass name, ID or buffer size of event
sysevent_get_size(3SYSEVENT) – get class name, subclass name, ID or buffer size of event
sysevent_get_subclass_name(3SYSEVENT) – get class name, subclass name, ID or buffer size of event
sysevent_get_time(3SYSEVENT) – get class name, subclass name, ID or buffer size of event
sysevent_get_vendor_name(3SYSEVENT) – get vendor name, publisher name or processor ID of event
sysevent_post_event(3SYSEVENT) – post system event for applications
sysevent_subscribe_event(3SYSEVENT) – register or unregister interest in event receipt
sysevent_unbind_handle(3SYSEVENT) – bind or unbind subscriber handle
sysevent_unsubscribe_event(3SYSEVENT) – register or unregister interest in event receipt
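
As a quick illustration of the subscriber side, here is a minimal C
sketch: it binds a handle with sysevent_bind_handle() and subscribes to
one event class. The class and subclass strings are hypothetical
placeholders for whatever your publisher posts; link with
-lsysevent -lnvpair.

#include <stdio.h>
#include <unistd.h>
#include <libsysevent.h>

/*
 * Minimal sysevent subscriber sketch (not production code).  The
 * "EC_my_class"/"ESC_my_subclass" strings below are illustrative.
 */
static void
event_handler(sysevent_t *ev)
{
	/* Print the class/subclass of each event delivered to the handle. */
	(void) printf("event: class=%s subclass=%s\n",
	    sysevent_get_class_name(ev),
	    sysevent_get_subclass_name(ev));
}

int
main(void)
{
	sysevent_handle_t *shp;
	const char *subclasses[] = { "ESC_my_subclass" };

	/* Bind the subscriber handle; events arrive on a separate thread. */
	if ((shp = sysevent_bind_handle(event_handler)) == NULL) {
		perror("sysevent_bind_handle");
		return (1);
	}
	/* Register interest in one subclass of an application event class. */
	if (sysevent_subscribe_event(shp, "EC_my_class", subclasses, 1) != 0) {
		perror("sysevent_subscribe_event");
		sysevent_unbind_handle(shp);
		return (1);
	}
	(void) pause();		/* wait for events */
	sysevent_unsubscribe_event(shp, "EC_my_class");
	sysevent_unbind_handle(shp);
	return (0);
}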

How to resolve fsflush overhead with large memory mappings

To reduce I/O on the Solaris platform, Solaris offers
virtual file systems: memory-based file systems that
provide access to kernel-specific resources.
As the name indicates, virtual file systems do not
use file system disk space. However, tmpfs uses the
swap space on a disk. tmpfs is the default file system
type for the /tmp directory in Solaris.

Since tmpfs uses local memory for file system reads and writes,
it has much lower latency than the classic Solaris
UFS. I/O performance is enhanced by reducing I/O
to a local disk or across the network, significantly
speeding up file creation, manipulation, etc. Therefore, tmpfs
can be utilized for the memory mapping.

Files in tmpfs file systems are volatile. The files disappear
when the file system is unmounted and when the system is shut down
or rebooted. Files can be moved into or out of the /tmp directory.
This means KTS needs to ensure a completed process image backup,
as a normal fsflush does.

Please note that tmpfs uses swap space for pageout. Processes may
fail to execute if the system does not have enough swap space,
which means tmpfs requires a larger swap space for pageout.


Other than the Solaris built-in kernel modules, Sun Storage Cache can
also be utilized to reduce the latency of the I/O activities.


To ensure no paging on the Solaris platform, use shmop(2) with
shmget(2) and the Intimate Shared Memory (ISM) variant of System V
shared memory. ISM mappings are created with the SHM_SHARE_MMU flag,
which locks down the memory used. Then just read the file
into shared memory. But this may result in code change.
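
As a rough illustration of that code change, the sketch below creates
a System V segment and attaches it with SHM_SHARE_MMU to get an ISM
mapping. The 64 MB size and IPC_PRIVATE key are arbitrary choices for
the example:

#include <stdio.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int
main(void)
{
	size_t size = 64UL * 1024 * 1024;	/* arbitrary 64 MB */
	int id;
	void *addr;

	/* Create a private segment. */
	if ((id = shmget(IPC_PRIVATE, size, IPC_CREAT | 0600)) == -1) {
		perror("shmget");
		return (1);
	}
	/* SHM_SHARE_MMU requests ISM: locked pages, shared MMU state. */
	if ((addr = shmat(id, NULL, SHM_SHARE_MMU)) == (void *)-1) {
		perror("shmat");
		return (1);
	}
	/* ... read the file into addr here ... */
	(void) shmdt(addr);
	return (0);
}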



If this is not an option, you can tune the flusher with these system parameters:

set segspt_minfree
set swapfs_minfree
set lotsfree
set desfree
set minfree

You can also lengthen the interval between flush cycles with:

set autoup

Monday, March 20, 2006

Solaris Demand-Paged Memory Management

Pertaining to the Solaris kernel process management and memory
management architecture: the heap segment within the process
virtual address space is allocated for user-land data structures
right above the executable's data segment of the user-land
DB process, and grows through libc.so.1 system library calls
such as malloc(3c), in which malloc_unlocked does
the dirty work of allocating holding blocks or ordinary
blocks for the user-land process. If there is no free block,
sbrk(2) is called.

The allocation is transparent zero-fill-on-demand: physical
pages are allocated on the page fault. Page memory allocated
for the process heap becomes a permanently allocated block.

(1) However, the allocation will not shrink until the
process exits.
(2) The page scanner daemon runs to page out memory pages
per LRU when there is a shortage of memory.

This is the core of the demand-paged memory management
architecture of the Solaris Operating System.

As for free(3c), free_unlocked does the dirty work of
placing the address space on a free list
for later use; it does not return the address space to the
managed memory resource pool.
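
A small sketch of this behavior, with an arbitrary 16 MB allocation:
the heap grows via brk/sbrk on malloc, a page is faulted in on first
touch, and the break does not move back down after free:

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int
main(void)
{
	void *before, *after;
	char *p;

	before = sbrk(0);		/* current heap break */
	if ((p = malloc(16 * 1024 * 1024)) == NULL)
		return (1);
	p[0] = 1;			/* first touch: ZFOD page fault */
	free(p);			/* goes to the free list only */
	after = sbrk(0);
	(void) printf("brk before: %p  after free: %p\n", before, after);
	return (0);
}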

Solaris Basic Library

These default memory allocation routines are safe for use
in multithreaded applications but are not scalable.
Concurrent accesses by multiple threads are single-threaded
through the use of a single lock. Multithreaded applications
that make heavy use of dynamic memory allocation should be
linked with allocation libraries designed for concurrent access,
such as libumem(3LIB) or libmtmalloc(3LIB). Applications that
want to avoid using heap allocations (with brk(2)) can do so
by using either libumem or libmapmalloc(3LIB). The allocation
libraries libmalloc(3LIB) and libbsdmalloc(3LIB) are
available for special needs.

Saturday, March 18, 2006

Faster zone provisioning using zoneadm clone, and DTrace to monitor zones

There is a wonderful blog on zones; I am reposting one engineer's
test below:

Faster zone provisioning using zoneadm clone

While discussing creating zones in parallel to reduce the time it takes to provision multiple zones, it was suggested that the new zoneadm clone subcommand could be of help. The zoneadm clone subcommand (available from build 33 onwards) copies an installed and configured zone. Cloning a zone is faster than installing a zone, but how much faster? To find out, an engineer did some quick experiments creating and cloning both whole-root and sparse-root zones on a V480:

Creating a whole root zone:

# zonecfg -z zone1
zone1: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zone1> create -b
zonecfg:zone1> set zonepath=/zones/zone1
zonecfg:zone1> exit
# time zoneadm -z zone1 install
Preparing to install zone .
Creating list of files to copy from the global zone.
Copying <123834> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <986> packages on the zone.
Initialized <986> packages on zone.
Zone is initialized.
Installation of these packages generated errors:
The file contains a log of the zone installation.

real 13m40.647s
user 2m49.840s
sys 4m43.221s

Cloning a whole root zone:

# zonecfg -z zone1 export|sed -e 's/zone1/zone2/'|zonecfg -z zone2
zone2: No such zone configured
Use 'create' to begin configuring a new zone.
# time zoneadm -z zone2 clone zone1
Cloning zonepath /zones/zone1...

real 8m4.615s
user 0m9.780s
sys 2m18.334s

For the whole root zone, cloning is almost twice as fast as a regular install.

Creating a sparse root zone:

# zonecfg -z zone3
zone3: No such zone configured
Use 'create' to begin configuring a new zone.
zonecfg:zone3> create
zonecfg:zone3> set zonepath=/zones/zone3
zonecfg:zone3> exit
# time zoneadm -z zone3 install
Preparing to install zone .
Creating list of files to copy from the global zone.
Copying <2535> files to the zone.
Initializing zone product registry.
Determining zone package initialization order.
Preparing to initialize <986> packages on the zone.
Initialized <986> packages on zone.
Zone is initialized.
Installation of these packages generated errors:
The file contains a log of the zone installation.

real 6m3.227s
user 1m45.902s
sys 2m47.717s

Cloning a sparse root zone:

# zonecfg -z zone3 export|sed -e 's/zone3/zone4/'|zonecfg -z zone4
zone4: No such zone configured
Use 'create' to begin configuring a new zone.
# time zoneadm -z zone4 clone zone3
Cloning zonepath /zones/zone3...

real 0m11.535s
user 0m0.706s
sys 0m6.440s

For the sparse root zone, cloning is more than thirty times faster than installing!

So if you need to provision multiple zones of a certain configuration, zoneadm clone is clearly the way to go.

Note that the current clone operation does not (yet) take advantage of ZFS. To see what ZFS can do for zone cloning, have a look at Mike Gerdts' blog: Zone created in 0.922 seconds. Goodness indeed.

Wednesday, May 25, 2005
Monitoring zone boot and shutdown using DTrace

Several people have expressed a desire for a way to monitor zone state transitions such as zone boot or shutdown events. Currently there is no way to get notified when a zone is booted or shutdown. One way would be to run zoneadm list -p at regular intervals and parse the output, but this has some drawbacks that make this solution less ideal:

* it is inefficient because you are polling for events,
* you will probably start at least two processes for each polling cycle (zoneadm(1M) and nawk(1)),
* more importantly, you could miss transitions if your polling interval is too large. Since a zone reboot might take only seconds, you would need to poll often in order not to miss a state change.

A better, much more efficient solution can be built using DTrace, the 'Swiss Army knife of system observability'. As mentioned in this message on the DTrace forum, the zone_boot() function looks like a promising way to get notifications when a zone is booted. Listing all FBT probes with the string 'zone_' in their name (dtrace -l -P fbt | grep zone_) turns up another interesting function: zone_shutdown(). To verify that these probes fire when a zone is either booted or shut down, let's enable both probes:

# dtrace -n 'fbt:genunix:zone_boot:entry, fbt:genunix:zone_shutdown:entry {}'
dtrace: description 'fbt:genunix:zone_boot:entry, fbt:genunix:zone_shutdown:entry ' matched 2 probes

When zoneadm -z zone1 boot is executed we see that the zone_boot:entry probe fires:

CPU ID FUNCTION:NAME
0 6722 zone_boot:entry

The zone_shutdown:entry probe fires when the zone is shut down (either by zoneadm -z zone1 halt or by using init 0 from within the zone):

0 6726 zone_shutdown:entry

This gives us the basic 'plumbing' for the monitoring script. By instrumenting the zone_boot() and zone_shutdown() functions with the FBT provider we can wait for zone boot and shutdown with almost zero overhead. Now what is left is finding out the name of the zone that was booted or shut down. This requires some knowledge of the implementation and access to the source (anyone interested can take a look at the source after OpenSolaris is launched, so stay tuned).

A quick look at the source shows that we can get the zone name by instrumenting a third function, zone_find_all_by_id(), that is called by both zone_boot() and zone_shutdown(). This function returns a pointer to a zone_t structure (defined in /usr/include/sys/zone.h). The DTrace script below uses a common DTrace idiom: in the :entry probe we set a thread-local variable trace that is used as a predicate in the :return probes (the :return probes have the information we're after). The FBT provider's :return probe stores the function return value in args[1], so we can access the zone name as args[1]->zone_name in fbt:genunix:zone_find_all_by_id:return and save it for later use in fbt:genunix:zone_boot:return and fbt:genunix:zone_shutdown:return.

#!/usr/sbin/dtrace -qs

self string name;

fbt:genunix:zone_boot:entry
{
	self->trace = 1;
}

fbt:genunix:zone_boot:return
/self->trace && args[1] == 0/
{
	printf("Zone %s booted\n", self->name);
	self->trace = 0;
	self->name = 0;
}

fbt:genunix:zone_shutdown:entry
{
	self->trace = 1;
}

fbt:genunix:zone_shutdown:return
/self->trace && args[1] == 0/
{
	printf("Zone %s shutdown\n", self->name);
	self->trace = 0;
	self->name = 0;
}

fbt:genunix:zone_find_all_by_id:return
/self->trace/
{
	self->name = stringof(args[1]->zone_name);
}


Starting the script and booting and shutting down some Zones gives the following result:

# ./zonemon.d
Zone aap booted
Zone noot booted
Zone noot shutdown
Zone noot booted
Zone aap shutdown

Friday, March 17, 2006

Virtual Machine on x86

OS architecture design and implementation evolved from
simple structures to the classic layered Unix approach
that Linux and Windows follow. Furthermore, to simplify
kernel manageability, the microkernel architecture was
proposed. However, due to performance and scalability,
the Solaris modular design won the game. One interesting
thing is that Mac OS X takes a hybrid structure which bridges
the layered BSD kernel design with a microkernel implementation:
the microkernel manages memory, RPC, IPC and kthread scheduling,
while the BSD kernel does the CLIs, file systems and all the POSIX APIs.

The traditional layered Solaris kernel design includes the concept
of abstracting the HW resources into several execution environments.
With such virtualization techniques, a process is provided with a
virtual copy of the underlying OS and HW resources.

Therefore the fundamental requirement for running a virtual machine
is to share the HW among different execution environments. In this
scheme, the virtual machine monitor runs in kernel mode while the
virtual machine executes in user mode, presenting its own relative
virtual user and kernel modes. If a process running in a virtual
machine makes a system call, control is transferred to the virtual
machine monitor, which changes the registers and the process program
counter to simulate the system call. Hence the major difference is
that real I/O takes much more time than virtual I/O does, and CPU
instruction time increases due to the multiple processes running
within each virtual machine. The virtual machine model is the best
fit for R&D.

However, it seems virtual machines can help resolve system
compatibility issues. The two popular flavors of virtual machine are
VMware and the Java VM. Since these virtual machines run on top of an
OS, the traditional OS designs and implementations such as Solaris
modules, microkernels and VMs still apply.

VMware abstracts the x86 platform into isolated virtual machines.
VMware runs as a user-land application on top of the host OS, which
enables multiple guest OSs to run concurrently, each within its own
virtual machine. However, the virtualization layer at the core of
VMware is the most expensive design for abstracting the underlying
resources into the various virtual machines' guest OSs. Each VM
has its own CPU, memory, devices, etc.

The JVM also abstracts the underlying OS and HW. It uses the class
loader and Java interpreter to execute the byte codes.

In general, the choice of design and utilization of a virtual machine
depends on the level of virtualization that fits the requirements.
For platform- and system-level virtualization across different guest
OSs, VMware is the choice. However, if you only want to virtualize
user-land applications, specifically Java applications, the JVM is
the right technical and political answer across different OSs. An
important note: application-level virtualization has been advanced
significantly by Sun ISVs such as Cassat for Java EE virtualization.

Thursday, March 16, 2006

Wireless TCP

Traditional TCP does not serve wireless connections efficiently, due to the conventional TCP design assumptions about congestion control and the "friendly design" of the protocols, which lead to slow start and fast retransmit/fast recovery. High error rates, mobility-caused packet dropping, and TCP's fundamental issue of timing out on a missing ACK assumed to be caused by congestion mean classic TCP does not work for mobile computing.

UDP leaves reliability and retransmission to the application layer.


Improved TCP variants exist, such as Indirect TCP (I-TCP), which uses the access point or FA for the Mobile Node and segments the TCP connection into two connections. Snooping TCP has the FA or access point buffer all data packets. Mobile TCP uses SH-MH connections and a persistent mode to resolve the issues. Selective retransmission is also a good solution. Transaction-oriented TCP combines the packets for connection establishment and connection release with user data packets, reducing the packets for the 3-way handshakes (WAP does similar things). Header compression does the work for gaming apps.

Wednesday, March 15, 2006

install skype on Ubuntu

mkdir skype
mv skype_1.2.0.18-1_i386.deb skype_1.2.0.18-1_i386.deb.orig
dpkg-deb --extract skype_1.2.0.18-1_i386.deb.orig skype
dpkg-deb --control skype_1.2.0.18-1_i386.deb.orig skype/DEBIAN
vi skype/DEBIAN/control
Change to:
Depends: libc6 (>= 2.3.2.ds1-4), libgcc1 (>= 1:3.4.1-3), libqt3c102-mt (>= 3:3.3.3.2) | libqt3-mt, libstdc++5 (>= 1:3.3.4-1), libx11-6 | xlibs (>> 4.1.0), libxext6 | xlibs (>> 4.1.0)

dpkg --build skype
mv skype.deb skype_1.2.0.18-1_i386.deb
dpkg -i skype_1.2.0.18-1_i386.deb

Tuesday, March 14, 2006

package parameters for zone scope

SUNW_PKG_ALLZONES, SUNW_PKG_HOLLOW, SUNW_PKG_THISZONE

(1) The SUNW_PKG_ALLZONES package parameter describes the zone scope of a package. This parameter defines the following:

* Whether a package is required to be installed on all zones
* Whether a package is required to be identical in all zones

The SUNW_PKG_ALLZONES package parameter has two permissible values, true and false. The default value is false.

(2) The SUNW_PKG_HOLLOW package parameter defines whether a package should be visible in any non-global zone if that package is required to be installed and be identical in all zones. It has two permissible values, true or false.

* If SUNW_PKG_HOLLOW is either not set or set to a value other than true or false, the value false is used.
* If SUNW_PKG_ALLZONES is set to false, the SUNW_PKG_HOLLOW parameter is ignored.
* If SUNW_PKG_ALLZONES is set to false, then SUNW_PKG_HOLLOW cannot be set to true.

(3) The SUNW_PKG_THISZONE package parameter defines whether a package must be installed only in the current zone, global or non-global. It has two permissible values, true and false. The default value is false.

(4) If a package is installed with pkgadd -G or has the pkginfo setting SUNW_PKG_THISZONE=true, the package can only be patched with patchadd -G.

Zones and Solaris hardening

Harden non-global zones using the Solaris Security Toolkit, not pkgrm.
Harden the global zone using the Solaris Security Toolkit so that any
subsequently created non-global zones will automatically be hardened.

pkgrm is the underlying mechanism for removing software packages from Solaris.
If a package is zone-aware, you would use pkgrm to remove it from the zones.
Depending on what the customer's definition of "hardening" may be, it could be possible to satisfy this requirement without using pkgrm.



Basically, hardening the system should not cause issues,
as long as you don't remove basic zones functionality (I'm assuming you'll pkgrm some packages, etc.).
I'd suggest just trying it on a test system first.
The minimum cluster that zones functionality is delivered in is SUNWCuser.
But it should be possible to start lower, i.e. with SUNWCreq, and build up; the following e-mail threads have some further discussion on this very topic.

Open Source and Open Service (OSS)

(1) Open Source vs Open Service (OSS):
why, what, when and how for Sun and partners.
What is the nature of the problems to resolve?
What related OSS has been envisioned
and implemented in the industry?
(Google, salesforce.com, Microsoft, IBM, etc.)
(2) What will be the engineering engagement platform for OSS?
(3) What is the engineering engagement protocol for OSS?
(4) What will be the engineering engagement execution
language for OSS?
(5) What will be the engineering engagement model for OSS?
(6) What will be the engineering engagement layered service model
for OSS?
(7) What will be the strengths and advantages of OSS vs traditional
engineering engagement?
(8) What will be the impact on the current engineering engagement
routine?
(9) How do we evaluate the performance of engineering engagement for
OSS?
(10) How do we ensure MDE and Tim's team differentiate and outperform
engineering engagement for OSS?
(11) What future work is to be done?

SPARC IV+ and T1 flavors

SPARC IV+ and T1 are different lines of platforms driving
industry requirements.

(1) From the uts implementation point of view, sun4v addresses the
future platform architecture down the road, from the core kernel
implementation to the FM architecture design. However, I do not
have the specific core structure, nor the x64 core-affinity structure.
This has been fitting CMP SPARC IV+ for a long time, since
SunOS 2.6, from processor sets to pid-based resource management.
This means sun4v requires more virtualization of the processor
and cores than the traditional sun4u architecture and implementation.
(2) In addition, beyond the firmware and HW architecture and
implementation, Niagara addresses throughput and latency
from the bottom up, as CMT promises.
(3) In process management, the ABI continues to offer the standard
for the Solaris binary interface as part of the ELF format, for both
platform-independent and processor-specific system calls and stack
management.

(4) Network performance and throughput at the device level, from traditional bge(7D) support, to sustain scalable, throughput-based volume
access.

(5) Other than the core uts processor implementation and the system
library I/O, fpc, IPC and px, pcb, vm, ebus, along with the genunix
implementation including all loadable kernel modules such as device
drivers, the core Solaris kernel implementation is reused, such as the
VM, file system and scheduling, as addressed in the Solaris 10 core
kernel implementation.

And so on and so forth.

In general, quite a lot of platform-specific enhancement has been done
for the T1 processor from the kernel perspective.

(6) At user land, it depends on the application provider's architecture
and implementation, and the level to which it utilizes process
management and resource allocation. What user-land thread model is
designed and implemented? How are LWPs created, and how are kthreads
leveraged? How are lock primitives designed at user land? What thread
library is used at user land for kthread creation and execution?

Communication Service with SSO

The Comms channels use the SSO adapter to provide SSO with MS Exchange.

It is required to get into MS Exchange quickly without having to use
a full Directory Server DN matching the Active Directory DN.

The Exchange plugin for the Mail channel uses the SSOAdapter
property "uid" for the IMAP user name and "password" for the password.

Niagara Process Image and ABI ELF format

One thing has been on my mind since I started working
on Niagara. From the Solaris process management point of view,
the process image is in ELF format, addressed as part
of the ABI standard that defines the Solaris binary interface.

ELF addresses both platform-independent and
processor-specific specifications. The processor-dependent
ABI standards include the calling sequence
(system calls, stack management, etc.) and the Solaris interface
for signals, process initialization, etc.

Does this mean Niagara inherits most of the SPARC IV+
function calls and stack management, process management and signal
interface?

Monday, March 13, 2006

How to setup cisco VPN client on Ubuntu

1. Install


# sudo su -
# apt-get install build-essential linux-headers-`uname -r`
# echo tun >> /etc/modules
# modprobe tun
# cd VPNCLIENT_SRC_DIR
# ./vpn_install
# /etc/init.d/vpnclient_init start

2. Edit the sfbay.pcf file in /etc/CiscoSystemsVPNClient/Profiles


Description=Ebay VPN3000
Host=192.18.42.83
AuthType=1
GroupName=vpn
EnableISPConnect=0
ISPConnectType=0
ISPConnect=
ISPCommand=
Username=ll149252
SaveUserPassword=0
EnableBackup=1
BackupServer=ivpn-east.sun.com,ivpn-central.sun.com,ivpn-aus.sun.com
TunnelingMode=1
TCPTunnelingPort=10000
EnableLocalLAN=0
EnableNat=1
CertStore=0
CertName=
CertPath=
CertSubjectName=
CertSerialHash=00000000000000000000000000000000
DHGroup=2
ForceKeepAlives=0

3. connect to VPN

$vpnclient connect sfbay

4. Disconnect VPN

$vpnclient disconnect

kmem(7D) and kstat on NG-Z resource mgt and performance monitoring

(1) Regarding Platform Computing

kmem(7D) is the device interface with three access routines below:

openkmem, which opens a /dev/kmem file descriptor;
the file is then read through the following routines:
kmemcpy
kstrncpy

These provide access to the virtual address space of the Solaris
kernel, excluding memory associated with an I/O device.

However, kmem(7D) does not have full functionality in
a non-global zone.

(2) Regarding HPOV

From the Solaris process management point of view,
the procfs pseudo file system exports the abstraction
of kernel process management. There are a few user-land
data structures that serve the performance management
needs. In addition, kstat(1M) is no longer sufficient
for the NG-Z use case, which addresses the uts structures
tagged with zoneid.
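
To make the kstat side concrete, here is a minimal libkstat reader.
It is only a sketch: the statistic unix:0:system_misc:nproc is an
arbitrary choice for illustration, and the program is linked with
-lkstat.

#include <stdio.h>
#include <kstat.h>

int
main(void)
{
	kstat_ctl_t *kc;
	kstat_t *ksp;
	kstat_named_t *kn;

	if ((kc = kstat_open()) == NULL) {
		perror("kstat_open");
		return (1);
	}
	/* Look up the unix:0:system_misc kstat and read its current data. */
	if ((ksp = kstat_lookup(kc, "unix", 0, "system_misc")) == NULL ||
	    kstat_read(kc, ksp, NULL) == -1) {
		perror("kstat lookup/read");
		return (1);
	}
	if ((kn = kstat_data_lookup(ksp, "nproc")) != NULL)
		(void) printf("nproc = %u\n", kn->value.ui32);
	(void) kstat_close(kc);
	return (0);
}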

S10 Resource Mgt and HW resource Mgt

(2) S10 SRM and resource pools
Having resource pools enabled allows one to have virtualized statistics
for things like the CPU kstats, APIs like sysinfo(3C) and
getloadavg(3C), and utilities like mpstat(1M) and vmstat(1M). Basically,
if pools are enabled, a zone will see a virtualized view of the relevant
statistics based on the pool the zone is bound to.
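
As a tiny illustration of one of those virtualized APIs, the sketch
below simply calls getloadavg(3C); inside a zone bound to a pool, the
values it returns would reflect that pool's view:

#include <stdio.h>
#include <sys/loadavg.h>

int
main(void)
{
	double avg[3];

	/* 1-, 5- and 15-minute load averages for our view of the system. */
	if (getloadavg(avg, 3) == -1) {
		perror("getloadavg");
		return (1);
	}
	(void) printf("load: %.2f %.2f %.2f\n", avg[0], avg[1], avg[2]);
	return (0);
}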

(3) FSS and processor resource usage

It's not as fine-grained as using solely FSS, but it does provide a
great deal of flexibility, including the ability to automatically set
the scheduling class of processes bound to the pool.


The HW resource management approach: the hypervisor.

You can lose a lot of optimization if the
hypervisor abstracts too many hardware details (thread-to-processor
affinity is one such example). The more general-purpose
the hypervisor (abstracting the most details), the fewer opportunities
for the OS to optimize (and in some cases it optimizes futilely,
like a compulsion for a pointless or destructive activity).

The relevance here is that the abstraction layer presented by
Solaris zones is higher in the stack (near the user-space layer),
so all of the platform-specific optimizations are available to the
kernel. When you begin to think about the impact of optimizations
such as multiple page sizes and memory placement, these details can
become very important. And of course there is reduced VM pressure
from sharing a common (but secure) buffer cache and shared libraries.

Mobile-aware Resource Discovery with ad-hoc networks

Scalability & Latency

1. Traditionally, physical entities such as a computer, network or storage system are considered resources. With service-oriented architecture, in addition to traditional physical resources, virtual services provide consistent functionality across the network.
2. Service discovery: it is important for a service consumer to identify a service and its characteristics in order to understand the interfaces and the authorized identity for accessing the service.
3. Traditionally, service discovery is accomplished by a service registry.
4. Users tend to discover a service based on their knowledge of the service.
5. Automatic service discovery, search, selection, matching, composition and interoperation, invocation and execution require a service description, which is crucial.
6. Functional classification or categorization of services is important for an efficient way of querying and indexing a specific service.
7. To discover and locate network-accessible capabilities, and to support heterogeneous environments, a standard RDS protocol and a standard mechanism for expressing resources are required.
8. Clients normally query resources by properties such as capabilities, quality, terms, configuration, etc. Therefore a description language is needed for resource discovery.
9. Discovery is a lightweight, non-authorized operation with no resource commitment. In addition, the aggregation of resource information for large, distributed resource sets needs to handle the overheads.
10. Due to network boundaries, both physical resources and virtual services may not be reachable, so a discovery model based on the network structure will not be effective, especially for mobile-aware provider and consumer applications and services, where service transmission should continue under mobility. The candidate models are: (1) the multicast model; (2) the directory server model; (3) the hierarchical directory server model, with a directory server located at each virtual logical boundary and the information for all services aggregated in the top-level logical container. The requester unicasts discovery queries to the server, and the queries are forwarded up the hierarchy. However, the hierarchical model requires the deployment of directory servers running in the upper- and lower-layer structure domains, and the directory servers in turn have to be configured in this hierarchical manner. One thing is for sure: applications in each local network do not require a global network connection.
11. A flooding-based unstructured P2P discovery model does not scale in terms of message overhead. Some have proposed optimized models that reduce the network traffic but increase the cost of query latency. DHT-based systems show scalability and efficiency but cannot handle complex queries. To avoid the overhead of resource discovery queries over the network, a semantics-based P2P query-forwarding strategy is valuable: forward queries only to semantically related nodes, with RDF used for resource and query expression. After the related node is identified, the original RDF query is applied to retrieve the designated information.

Sunday, March 12, 2006

Process and Address Space

The address space is the kernel abstraction for managing the memory
page allocation for a process. A process needs memory address
space to store its text instructions, data segment, heap (as temporary
process space) and stack.

HW context: the platform-specific process execution environment,
such as registers --------- CPU
and address space --------- Memory

SW context: process credentials, open file lists, PIDs, signal disposition, signal handlers, scheduling class and priority, process state.

Most processes have stdin, stdout and stderr, which define the source and destination
of input and output character streams for the process.

In the Solaris kernel, a process is composed of LWPs and kthreads. These are kernel data structures
linked to the process structure. This threading model separates the user-land threads from the
kernel threads.

A user-land thread is created by the routine call thr_create(3T) or pthread_create(3T) from
libthread.so or libpthread.so. A user thread has its own scheduling and priority scheme,
different from the kernel scheduling classes and priorities; this is done by calling into the
thread library's dispatcher and switch routines periodically at predefined preemption points.
A user-land thread is scheduled by linking it onto an available LWP, which is required for a
user-land thread to execute. However, creating a user thread does not mean an LWP is created:
the kernel creates an LWP only when the thread is created with the THR_NEW_LWP or THR_BOUND
flag. But a threaded process with many LWPs bound to kernel threads will suffer a performance
impact.

In addition, thr_setconcurrency(3T) informs the kernel how many threads the programmer wishes to run.
Without a specification in the code, the thread library maintains a reasonable number of LWPs for user-land thread execution.
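
A minimal sketch of these calls, assuming the Solaris threads API
described above (compile with cc t.c -lthread; output order is not
deterministic):

#include <stdio.h>
#include <thread.h>

static void *
worker(void *arg)
{
	/* Report which flavor of thread we are and our thread id. */
	(void) printf("%s thread running as id %u\n",
	    (char *)arg, (unsigned int)thr_self());
	return (NULL);
}

int
main(void)
{
	thread_t t1, t2;

	/* Hint to the library how many LWPs we would like active. */
	(void) thr_setconcurrency(2);

	/* Unbound: scheduled by the library onto an available LWP. */
	(void) thr_create(NULL, 0, worker, "unbound", 0, &t1);

	/* Bound: gets its own dedicated LWP. */
	(void) thr_create(NULL, 0, worker, "bound", THR_BOUND, &t2);

	(void) thr_join(t1, NULL, NULL);
	(void) thr_join(t2, NULL, NULL);
	return (0);
}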

The system should balance the cases of too many or too few LWPs:
too many LWPs cause kernel overhead to manage them, and too few LWPs
leave many runnable user-land threads waiting for resources to execute.

As in the traditional process model, fork and exec create new processes.

In the multithreaded process model, the HW context is not shared among user-land threads.
However, the SW context, such as the address space, PIDs, credentials, signal handlers, etc., is shared.

ABI & ELF & Process Image

With the ELF format defined in the ABI standard, the kernel and OS
tools create executable objects that can be loaded into memory and
turned into processes for scheduling and execution.

A program becomes a binary executable object when it is compiled
and linked with the OS's language-specific compiler. When the
executable object is exec'd, the dynamic linking process starts:
ld.so.1(1) is called to link the object with other shared objects,
such as libc.so.1 (a dynamic link library), for instruction execution.
Please note that all references in the program are resolved via ld.so.1.

However, static linking can be achieved via the -B static flag at
compile time, which forces all references to be included at build
time. Just as the dynamic-link process needs libc.so.1, the
static-link process requires libc.a. A 64-bit app cannot be built with
static linking, since no static archive library (libc.a) is released
with the OS.

A program is compiled into an ELF-format executable object.
ELF defines the format of the process on disk and in memory (the
process image). The ELF format is the part of the ABI standard that
states the OS binary interface for compiled and executed programs.

ELF addresses both platform-independent and processor-specific
specifications. The processor-dependent ABI standards include the
function-calling sequence (system calls, stack management, etc.) and
the OS interface for signals, process initialization, etc.
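
To make the on-disk half of this concrete, the sketch below reads the
ELF header of a 32-bit executable directly with <elf.h>. libelf would
be the robust way; this is just an illustration, with no magic-number
validation:

#include <stdio.h>
#include <elf.h>

int
main(int argc, char **argv)
{
	Elf32_Ehdr eh;
	FILE *fp;

	if (argc != 2 || (fp = fopen(argv[1], "r")) == NULL) {
		(void) fprintf(stderr, "usage: %s file\n", argv[0]);
		return (1);
	}
	/* The ELF header sits at the start of the file. */
	if (fread(&eh, sizeof (eh), 1, fp) != 1) {
		(void) fprintf(stderr, "short read\n");
		return (1);
	}
	/* e_type: ET_EXEC or ET_DYN; e_machine: EM_SPARC, EM_386, ... */
	(void) printf("type=%d machine=%d entry=0x%x\n",
	    eh.e_type, eh.e_machine, (unsigned)eh.e_entry);
	(void) fclose(fp);
	return (0);
}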

S10 PCB Structure

Two major abstractions of Solaris: process and file.

(1) The process is the basic unit of scheduling and execution on Solaris.
(2) Multithreaded process architecture: process, LWP and kernel thread.
(3) Solaris kernel process model: procfs, signals, process groups, session management.
(4) The Solaris kernel maintains a system-wide process table of PIDs and related data.
(5) The Solaris process abstraction includes the traditional Unix process model of
HW state and OS-maintained SW context. Additionally, it supports multithreaded
execution within a single process: each thread shares the process state
and can be scheduled and executed on a processor independently of the other
threads within the same process.
(6) Solaris provides a time-sharing scheduling policy with a round-robin approach
for its process scheduling scheme and algorithm.

In addition to the unique Solaris process architecture design, a good tool has
evolved for process monitoring. Procfs is the pseudo file system that exports
the process abstraction model to users with a file-system-like interface for
extracting process data.
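
A short example of that interface, assuming the psinfo file described
in proc(4): read /proc/<pid>/psinfo into a psinfo_t and print a few
fields (defaults to /proc/self if no pid is given):

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <procfs.h>

int
main(int argc, char **argv)
{
	char path[64];
	psinfo_t ps;
	int fd;

	(void) snprintf(path, sizeof (path), "/proc/%s/psinfo",
	    argc > 1 ? argv[1] : "self");
	/* psinfo is a fixed-size binary structure; one read suffices. */
	if ((fd = open(path, O_RDONLY)) == -1 ||
	    read(fd, &ps, sizeof (ps)) != sizeof (ps)) {
		perror(path);
		return (1);
	}
	(void) printf("pid=%d nlwp=%d cmd=%s\n",
	    (int)ps.pr_pid, (int)ps.pr_nlwp, ps.pr_fname);
	(void) close(fd);
	return (0);
}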

Saturday, March 11, 2006

S10 Well Known Processes

(1) Memory scheduler

proc_sched

(2) init process

proc_init

(3) pageout daemon

proc_pageout

(4) fsflush

proc_fsflush

S10 zones vs IBM LPARs

- Platform availability - Zones are available on SPARC, x86 and x64, and with
OpenSolaris, who knows what else; LPARs are extremely vendor-specific.

- Performance overhead - Zones offer zero performance overhead for applications,
as there is no virtualization layer (a la hypervisor) that apps have to
punch through. The overall system overhead for Zones is minimal, due to
all the resource sharing. Contrast this architecturally against LPARs.

- Manageability - LPARs do nothing for manageability of the datacenter;
all they do is consolidate the hardware footprint. For a large percentage of
apps, Zones resolve a large part of the management headache.

- Observability - this is *key*. If an app in an LPAR is not behaving,
there is no way for someone inside that OS instance to see what's going
on around it. You can't call someone up who can check the entire
platform to try and diagnose the problem. With Zones, the global zone
admin has full visibility into all the local zones, and into the entire
hardware platform, with no virtualization in the way.

Friday, March 10, 2006

MN Discovery and IP routing

1. The CN sends an IP packet with a specific MN as the destination address and the CN as the source address.
2. The CN's router does not know where the MN is; it just routes the IP packet to the router responsible for the MN's HN.
3. The HA intercepts the packets. The HA knows that the MN is not at the HN now, and forwards the packet into the subnet encapsulated and tunnelled to the COA. The new header in front of the old IP header shows
that the HA is the source and the COA is the target address.
4. The FA extracts the original packet and sends the data to the MN with the CN as the source.

The MN receives the packet as if it came from the CN to the MN directly.

The data flow in the other direction is simple: the MN sends packets directly to the CN, provided the CN is not mobile.

In the case of a mobile CN, steps 1-4 are repeated.
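
A sketch of the encapsulation in step 3, assuming RFC 2003-style
IP-in-IP (the real tunneling happens in the HA's kernel; this
user-level helper is purely illustrative, and checksum/fragmentation
handling is omitted):

#include <sys/types.h>
#include <string.h>
#include <netinet/in.h>
#include <netinet/in_systm.h>
#include <netinet/ip.h>

/*
 * Hypothetical helper, not a real Mobile IP implementation: wrap an
 * existing datagram (src = CN, dst = MN) in an outer header the way
 * the HA tunnels it to the COA.
 */
size_t
encapsulate(const struct ip *inner, size_t inner_len,
    struct in_addr ha, struct in_addr coa, unsigned char *buf)
{
	struct ip outer;

	(void) memset(&outer, 0, sizeof (outer));
	outer.ip_v = IPVERSION;		/* 4 */
	outer.ip_hl = 5;		/* 20-byte header, no options */
	outer.ip_ttl = 64;
	outer.ip_p = IPPROTO_ENCAP;	/* protocol 4: IP-in-IP */
	outer.ip_len = htons(sizeof (outer) + inner_len);
	outer.ip_src = ha;		/* tunnel entry: home agent */
	outer.ip_dst = coa;		/* tunnel exit: care-of address */

	/* Outer header first, then the untouched original datagram. */
	(void) memcpy(buf, &outer, sizeof (outer));
	(void) memcpy(buf + sizeof (outer), inner, inner_len);
	return (sizeof (outer) + inner_len);
}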