Saturday, September 29, 2007

How to define a set of values

BNF is a way to define a set of values. The set of rules is called a grammar: a BNF grammar defines a set of values.

::= ()
::= (.)

Thursday, September 27, 2007

ZFS and DS

MODS/WRITES with ZFS compression
It would seem that using ZFS with compression could theoretically yield some amazing write performance but I haven't done any testing myself yet. Any figures or observations here?

CACHE COMPRESSION?
I'm not sure what they meant there, because I have seen that ZFS uses the ARC for relinquishing memory to cache. However, the memory relinquishing algorithm was less than desirable for database apps. Cache compression, on the other hand, was meant to minimize the memory footprint of the DS entries within the FS cache, which sounds great, if it were possible, for large databases.

Lambda Calculus

Lambda calculus, also written λ-calculus, is a formal system designed to investigate (1) function definition, (2) function application and (3) recursion.

Lambda calculus can be used to define what a computable function is. The question of whether two lambda calculus expressions are equivalent cannot be solved by a general algorithm. This was the first question, even before the halting problem, for which undecidability could be proved.

Lambda calculus can be called the smallest universal programming language. It consists of:

(1) a single transformation rule: variable substitution

(2) a single function definition scheme.

Lambda calculus is universal in the sense that any computable function can be expressed and evaluated using this formalism. It is thus equivalent to the Turing machine formalism.

Lambda calculus emphasizes the use of transformation rules (variable substitution) and does not care about the actual machine implementing them. It is an approach more related to software than to hardware.
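
To make function definition, application and recursion concrete, here is a minimal sketch using Python lambdas. Python is my choice for illustration only, and every name in it (zero, succ, add, mul, to_int, Z, fact) is made up for this example. Numbers are Church numerals, i.e. "apply f n times". Recursion comes from the Z fixed-point combinator, so fact never refers to itself by name. For brevity fact uses ordinary Python integers and a conditional rather than pure Church encodings.

```python
# Church numerals: the number n is "apply f to x, n times".
zero = lambda f: lambda x: x
succ = lambda n: lambda f: lambda x: f(n(f)(x))
add = lambda m: lambda n: lambda f: lambda x: m(f)(n(f)(x))
mul = lambda m: lambda n: lambda f: m(n(f))


def to_int(n):
    """Decode a Church numeral by applying 'add one' to 0, n times."""
    return n(lambda k: k + 1)(0)


# Z combinator: a fixed-point combinator that works under eager evaluation.
# It gives recursion without any function naming itself.
Z = lambda f: (lambda x: f(lambda v: x(x)(v)))(lambda x: f(lambda v: x(x)(v)))

# Factorial written without self-reference; 'self' is supplied by Z.
# Plain Python ints are used here only to keep the sketch short.
fact = Z(lambda self: lambda n: 1 if n == 0 else n * self(n - 1))

one = succ(zero)
two = succ(one)
three = add(one)(two)
six = mul(two)(three)
```

Evaluating to_int(three) and to_int(six) decodes the numerals back to ordinary integers, and fact(5) exercises the recursion.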


Samba with Solaris Zone

touch /etc/krb5.keytab

The smb.conf file was not configured to use a Kerberos keytab, but Samba will try to use one anyway.

Storage Data Classification & Storage Policies

The archive solution is designed for tape and disk storage devices. The NAS package provides a file-level representation of the data. The storage management system requires data classification and policies.

Data classification is done based on the file system and a few other attributes. The attributes include file name, type, ownership, size, access rights and age. Classification can be done automatically using customer-defined categories.

The customer-defined data classification categories are assigned to storage management policies. Policies control the placement of the files onto the various types of physical storage, as well as replication, deletion, and access.
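
The relationship between attributes, categories and policies can be sketched as below. This is only an illustration of the idea, not how any particular storage product implements it: the category names, thresholds, tier names and the FileInfo/Policy/Category types are all hypothetical.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class FileInfo:
    """The attributes used for classification (a hypothetical subset)."""
    name: str
    owner: str
    size_bytes: int
    age_days: int


@dataclass
class Policy:
    """What the storage manager does with files in a category."""
    tier: str                          # which kind of physical storage
    replicas: int                      # how many copies to keep
    delete_after_days: Optional[int]   # None means never delete


@dataclass
class Category:
    """A customer-defined category: a matching rule plus its policy."""
    label: str
    matches: Callable[[FileInfo], bool]
    policy: Policy


# Hypothetical customer-defined categories, checked in order.
CATEGORIES: List[Category] = [
    Category("old-logs",
             lambda f: f.name.endswith(".log") and f.age_days > 90,
             Policy(tier="tape", replicas=1, delete_after_days=365)),
    Category("large-files",
             lambda f: f.size_bytes > 1 << 30,
             Policy(tier="nearline-disk", replicas=2, delete_after_days=None)),
]

DEFAULT = Category("default", lambda f: True,
                   Policy(tier="primary-disk", replicas=2, delete_after_days=None))


def classify(info: FileInfo) -> Category:
    """Return the first category whose rule matches, else the default."""
    for category in CATEGORIES:
        if category.matches(info):
            return category
    return DEFAULT
```

For example, classify(FileInfo("app.log", "root", 4096, 120)) lands in the "old-logs" category, whose policy places the file on the tape tier.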

DS replication and search

Enabling or disabling replication should not impact search.

Oracle Database and Solaris Zone

The Oracle database product line comes in two significantly different forms:

1) Oracle has a single-machine version of its database product. It supports zone deployment today.

2) Oracle has a version of its database product that works cooperatively and concurrently on multiple nodes. This product is called Oracle RAC. Oracle RAC does not run in zones now, but it will do so soon.

POSIX Thread, Select & Poll

The model of simply using one thread per connection, while simpler to code and support,
is viable for only a few to several hundred (possibly a few thousand) threads.
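
For reference, the one-thread-per-connection model looks roughly like this. It is a sketch in Python rather than C with POSIX threads, and the handle and serve_each_in_thread names are made up for the example. The handler just upper-cases whatever it receives.

```python
import socket
import threading


def handle(conn):
    """Serve one connection: echo received data back in upper case."""
    with conn:
        while True:
            data = conn.recv(1024)
            if not data:          # peer closed the connection
                break
            conn.sendall(data.upper())


def serve_each_in_thread(conns):
    """Start one dedicated thread per connection and return the threads."""
    threads = [threading.Thread(target=handle, args=(c,)) for c in conns]
    for t in threads:
        t.start()
    return threads
```

Each blocked recv() ties up a whole thread, which is why this approach stops scaling somewhere in the hundreds to low thousands of connections.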

From a performance perspective, the select(3C) man page has stated (from circa Solaris 2.6 days, IIRC) that the "poll(2) function is preferred over this function." select(3C) in Solaris is very inefficient because it has to translate the bitfields in user space and then calls poll() to do the kernel work anyway. Worse, recompiling an application that uses select(3C) as a 64-bit binary can result in a 500% performance drop. This occurs if the application does not modify FD_SETSIZE, which goes from 1024 in 32-bit compiles to 65536 in 64-bit compiles. So avoid select like the plague if performance is an issue for your application. select has been implemented as a direct syscall in several other Unix versions, so code calling select(3C) will likely underperform on Solaris compared to other OSs.
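
As a rough illustration of the poll(2) style, where you register descriptors instead of filling in fixed-size bitfields, here is a sketch using Python's select.poll, which wraps poll(2) on Unix-like systems. The wait_readable helper is a made-up name, and a real server would register its descriptors once rather than on every call.

```python
import select
import socket


def wait_readable(socks, timeout_ms=1000):
    """Return the sockets in socks that poll(2) reports as readable."""
    poller = select.poll()
    by_fd = {}
    for s in socks:
        # Register interest in "data available to read" for each descriptor.
        poller.register(s.fileno(), select.POLLIN)
        by_fd[s.fileno()] = s
    # poll() returns (fd, event mask) pairs for descriptors with activity.
    return [by_fd[fd] for fd, events in poller.poll(timeout_ms)
            if events & select.POLLIN]
```

There is no FD_SETSIZE-style limit here: the cost depends on how many descriptors are registered, not on the size of a fixed bitmap.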

Even poll(2) suffers from a lot of unnecessary performance degradation, so when dealing with thousands of descriptors, the far more scalable /dev/poll (see "man -s 7d poll") or the Event Completion Framework should be used. Here's an article written back in 2002 comparing the scalability of poll(2) and /dev/poll:

http://developers.sun.com/solaris/articles/polling_efficient.html

It seems /dev/poll has been considered a bit too complicated for a lot of real-world uses, and it also suffers from performance issues when the list of file descriptors to monitor changes quickly, so the even newer Event Completion Framework/Ports was created to deal with both issues:

http://developers.sun.com/solaris/articles/event_completion.html
http://partneradvantage.sun.com/protected/solaris10/adoptionkit/tech/tecf.html

Both /dev/poll and Event Ports have been ported to Linux. So whether they are POSIX standards, extensions, or just de facto standards, they must be considered, since they are the only tools for the job when the state of many thousands of connections must be polled.

SAN Disk/Tape backup with Solaris Container

Several systems are set up as SAN media servers so that backup to tapes occurs from SAN-disk to SAN-tape rather than across a network.

The simplest approach is for each non-global zone to send its backup stream to the media server. This avoids potential security issues that must be addressed if you back up the zones' file systems from the global zone (GZ).

DS database minimum cache size heuristics

The database cache should be at least large enough to hold the indexes. It is true that during a search DS first reads the indexes and then the entries, but I would not make any assumption about what that means for the database cache.

Usually, when coding a cache algorithm, if there is not enough space to add some data, the data that has gone unused for the longest time is removed and the new data replaces it (LRU, as opposed to MRU). This is true for the entry cache and the changelog cache, but I do not know if it is also true for the DB cache.

So if you do not have enough cache for both kinds of data, you will instead end up with the most frequently used index and entry pages staying in the cache while the others go in and out.
IMHO, there is probably no "priority" for indexes versus entries ...
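
A toy version of that eviction behaviour, with index and entry pages competing in one cache, might look like this. It is only a sketch of generic LRU eviction, not the actual DS or database cache code. The LRUCache class and the "index:"/"entry:" key naming are invented for the example.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-capacity cache that evicts the least recently used page."""

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._pages = OrderedDict()   # oldest use first, newest use last

    def get(self, key):
        """Return the cached page, or None on a miss."""
        if key not in self._pages:
            return None
        self._pages.move_to_end(key)  # mark as most recently used
        return self._pages[key]

    def put(self, key, page):
        """Insert or refresh a page, evicting the oldest one if full."""
        if key in self._pages:
            self._pages.move_to_end(key)
        self._pages[key] = page
        if len(self._pages) > self.capacity:
            self._pages.popitem(last=False)  # drop least recently used

    def keys(self):
        """Cached keys ordered from least to most recently used."""
        return list(self._pages)
```

Note that nothing here distinguishes index pages from entry pages: whichever pages are touched most recently stay, which matches the "no priority" guess above.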

For a modify, things are much more complex: the dn2entryid index is read first, then the entry, then all the indexes.

System Call vs. POSIX Thread on Select & Poll

The Solaris system calls select() (BSD) and poll() (AT&T) are _not_ POSIX; BSD and AT&T Unix each used to have their own way of doing this. If you really want to write POSIX-compliant code, using POSIX threads is recommended over relying on select() or poll(). With POSIX threads you should be able to avoid these two system calls altogether. Alternatively, you could supply an extra .c file that contains the call to the poll() system call; this file must be compiled with the _POSIX_C_SOURCE macro undefined. (You will be calling poll() even if you call select(): as Solaris 2.x is derived from AT&T Unix, poll(2) is used to implement select(3C).)

Thursday, September 06, 2007

XWindows on Solaris 10

A fully qualified (FQ) hostname is required in the case of VPN client access.

(1) ssh -X -l
On the remote host, run the UI application directly.

ssh -X forwards X11, so there is no need to export DISPLAY any more.


(2) vnc

Run vncserver on the remote server as

vncserver :1
This may prompt for a password if you have not set one yet.

On the local client:

vncviewer :1
Type in the password.