Thursday, August 30, 2007

SOA Enable C++

Many WS-enabling tools are available on the market for C++, such as gSOAP and OSS. Other options, especially if you use Java CAPS, are TCP/IP, HTTP with a non-SOAP payload, and JMS. SGF is always an option. Wrapping C++ with a Java WS is not a very good idea.


The combination of gSOAP and JMS was used for integration with C++ and had a successful production deployment.

Wednesday, August 29, 2007

Avoiding the time() system call on Solaris

Some products make heavy use of the times() syscall to measure CPU time and system time.
I'd like to get rid of any system call overhead if possible.

http://developers.sun.com/solaris/articles/time_stamp.html

This article shows how to optimize the performance of enterprise systems that employ extensive time-stamping using the time(2) system call.
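One common way to avoid paying for a clock call on every timestamp is to cache the time and refresh it periodically. The following is a minimal sketch of that idea in Java, not necessarily the technique the linked article uses. The class name CoarseClock and the refresh interval are assumptions made for illustration.

```java
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper: serves timestamps from a cached value that a
// background thread refreshes at a fixed interval.
final class CoarseClock {
    private volatile long cachedMillis = System.currentTimeMillis();
    private final ScheduledExecutorService refresher;

    CoarseClock(long refreshMillis) {
        // Daemon thread, so the refresher never keeps the JVM alive.
        refresher = Executors.newSingleThreadScheduledExecutor(r -> {
            Thread t = new Thread(r, "coarse-clock");
            t.setDaemon(true);
            return t;
        });
        refresher.scheduleAtFixedRate(
            () -> cachedMillis = System.currentTimeMillis(),
            refreshMillis, refreshMillis, TimeUnit.MILLISECONDS);
    }

    // Reads the cached value only; no clock call on this path.
    long now() {
        return cachedMillis;
    }

    void stop() {
        refresher.shutdownNow();
    }
}
```

The trade-off is precision: a timestamp can be stale by up to one refresh interval, so this only suits code that stamps events far more often than it needs fine resolution.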

File Descriptor and Solaris Sockets

If rlim_fd_max is the same as rlim_fd_cur, then only 256 file descriptors are being allocated. Set rlim_fd_max to 2048 or higher for a highly loaded system.

Friday, August 17, 2007

JMS design and analysis

(1) Design Pattern: Publish/Subscribe

1 Producer -----(1 Message) ---> Topic (Destination) ----> K Consumers
K Producers ----(K Messages) ---> Topic (Destination) ---> K Consumers

(2) Service Reliability: Message Reliability:

With a normal topic subscription, only active consumers subscribed
to the topic will receive messages.

With durable subscriptions, inactive consumers are guaranteed
to receive the messages once they subsequently become active.

Intuitively, the topic does not hold the messages it receives
unless it has inactive consumers with durable subscriptions.

Hence, durable subscription is the service reliability practice.
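The difference between the two subscription types can be illustrated with a small in-memory model. This is a toy sketch in plain Java, not JMS code; ToyTopic and its methods are hypothetical names. It only mimics the rule described above: messages are held solely for inactive durable subscribers.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical in-memory topic that models durable and non-durable subscribers.
final class ToyTopic {
    private static final class Sub {
        final boolean durable;
        boolean active = true;
        final List<String> received = new ArrayList<>();
        final List<String> backlog = new ArrayList<>();
        Sub(boolean durable) { this.durable = durable; }
    }

    private final Map<String, Sub> subs = new LinkedHashMap<>();

    void subscribe(String name, boolean durable) {
        subs.put(name, new Sub(durable));
    }

    // Reactivating a subscriber flushes whatever was held for it.
    void setActive(String name, boolean active) {
        Sub s = subs.get(name);
        s.active = active;
        if (active) {
            s.received.addAll(s.backlog);
            s.backlog.clear();
        }
    }

    void publish(String msg) {
        for (Sub s : subs.values()) {
            if (s.active) {
                s.received.add(msg);      // delivered immediately
            } else if (s.durable) {
                s.backlog.add(msg);       // held until the subscriber returns
            }
            // inactive and non-durable: the message is simply missed
        }
    }

    List<String> received(String name) {
        return subs.get(name).received;
    }
}
```

If both subscribers go inactive while a message is published, only the durable one sees that message after coming back.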

(3) Best Practice-1: Development with Unified Domain Model

Even though the domain-specific interfaces (Queue, Topic)
are still supported for legacy purposes, the best development
practice is to use the Unified Domain Interface,
which transparently supports both the P-P and Pub-Sub models.


(4) Best Practice-2: Use Administered Object Store

It is recommended that the connection factory be created and reconfigured
with the administration tools. Admin objects such as ConnectionFactory
are then placed in an Administered Object Store. This decouples JMS application
code from the provider and makes it portable among different JMS providers.



(5) Best Practice-3: J2EE Client By Resource Annotation

Use @Resource injection, with no exception handling, in J2EE 1.5 as a best
practice.

(6) Code Segment: to show message life cycle

Use the Unified Domain createConnection method, which is in javax.jms.ConnectionFactory.

Hashtable env = new Hashtable();
env.put(Context.INITIAL_CONTEXT_FACTORY,
"com.sun.jndi.fscontext.RefFSContextFactory");
env.put(Context.PROVIDER_URL, "file:///amberroad:/sun_mq_admin_objects");
Context ctx = new InitialContext(env);
String CF_LOOKUP_NAME = "ARConnectionFactory";
ConnectionFactory arFactory = (ConnectionFactory) ctx.lookup
(CF_LOOKUP_NAME);
String DEST_LOOKUP_NAME = "ARTopic";
Destination arTopic = (Destination) ctx.lookup(DEST_LOOKUP_NAME);



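Connection arConnection = null;
Session arSession = null;
MessageConsumer arConsumer = null;
try {
    arConnection = arFactory.createConnection("amberroad", "amberroad");
    arConnection.setExceptionListener(this); // enclosing class implements ExceptionListener
    arSession = arConnection.createSession(false, Session.AUTO_ACKNOWLEDGE);
    MessageProducer arProducer = arSession.createProducer(arTopic);
    String arSelector = "/* Text of selector here */";
    // createDurableSubscriber takes a Topic, so the looked-up Destination is cast
    arConsumer = arSession.createDurableSubscriber((Topic) arTopic, "arConsumer", arSelector, true);
    TextMessage arMsg = arSession.createTextMessage();
    arMsg.setText("AmberRoad Test");
    arMsg.setJMSReplyTo(replyDest); // replyDest: a reply Destination obtained elsewhere

    // producer-level specification; send() overwrites the delivery mode,
    // priority and expiration headers set directly on a message
    arProducer.setDeliveryMode(DeliveryMode.PERSISTENT);
    arProducer.setDisableMessageTimestamp(true);
    arProducer.setPriority(Message.DEFAULT_PRIORITY);
    arProducer.setTimeToLive(1000L);
    arProducer.send(arMsg);

    // alternatively, per-send specification
    arProducer.send(arMsg, DeliveryMode.PERSISTENT, 9, 1000);

    arConnection.start();

    // option 1: async listener
    // ARMessageListener arListener = new ARMessageListener();
    // arConsumer.setMessageListener(arListener);

    // option 2: synchronous receiver (cannot be combined with a listener on the
    // same consumer); receiveNoWait() returns null when nothing is pending,
    // whereas receive() would block
    Message inMsg = arConsumer.receiveNoWait(); // distributed apps
    if (inMsg != null) {
        inMsg.clearBody(); // a consumer does this to make the received message writable
    }

    arConnection.stop(); // in case messaging needs to be suspended
} catch (JMSException e) {
    e.printStackTrace();
} finally {
    try {
        if (arConsumer != null) arConsumer.close(); // must be closed before unsubscribing
        if (arSession != null) arSession.unsubscribe("arConsumer");
        if (arConnection != null) arConnection.close();
    } catch (JMSException e) {
        e.printStackTrace();
    }
}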

Note


(1) For the publish/subscribe domain, creating durable topic subscriptions
requires a client identifier. This is arranged by configuring the client runtime
to provide a unique client identifier automatically for each JMS application.

(2) For message consumption, with auto-acknowledge mode, the Message Queue
client runtime immediately sends a client acknowledgment for each message
it delivers to the message consumer; it then blocks waiting for a return
broker acknowledgment confirming that the broker has received the client
acknowledgment. This frees the JMS application code from handling acknowledgments.


(3) For message producers, the broker's acknowledgment behavior
depends on the message's delivery mode defined in the message header.
The broker acknowledges the receipt of persistent messages
but not of non-persistent ones; this is not configurable by the client.




For Receiving Messages Asynchronously



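import java.util.Enumeration;
import javax.jms.*;

public class ARMessageListener implements MessageListener
{
    // MessageListener.onMessage declares no checked exceptions,
    // so JMSException is handled inside the method
    public void onMessage(Message inMsg)
    {
        try {
            Destination replyDest = inMsg.getJMSReplyTo();
            // JMSX properties are provider-dependent and may not be set
            long timeStamp = inMsg.getLongProperty("JMSXRcvTimestamp");
            Enumeration propNames = inMsg.getPropertyNames();

            while (propNames.hasMoreElements())
            {
                String eachName = (String) propNames.nextElement();
                Object eachValue = inMsg.getObjectProperty(eachName);
            }
            // getText() exists only on TextMessage
            if (inMsg instanceof TextMessage)
            {
                String textBody = ((TextMessage) inMsg).getText();
            }
        } catch (JMSException e) {
            e.printStackTrace();
        }
    }
}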

Thursday, August 16, 2007

Running Weka on Linux and Unix

(1) Under package weka.gui

LookAndFeel.props

(2) uncomment the first configuration

# Look'n'Feel configuration file
# $Revision: 1.1.2.2 $

# the theme to use, none specified or empty means the system default one
Theme=javax.swing.plaf.metal.MetalLookAndFeel
#Theme=com.sun.java.swing.plaf.gtk.GTKLookAndFeel
#Theme=com.sun.java.swing.plaf.motif.MotifLookAndFeel
#Theme=com.sun.java.swing.plaf.windows.WindowsLookAndFeel
#Theme=com.sun.java.swing.plaf.windows.WindowsClassicLookAndFeel

Wednesday, August 15, 2007

Classification Time Complexity

NBTree is O(n^3)
Decision trees are O(n^2)
Naive Bayes is O(n).

Sunday, August 12, 2007

NBTree Algorithmic Time Complexity

NBTree uses cross-validation at each node level to
decide whether to split or to construct a naive Bayes
model.

Friday, August 10, 2007

Curse of dynamic programming and stochastic control

(1) Curse of parametric approximation to cost-to-go function
(2) modeling without closed form objective function

function approximation, iterative optimization, neural network learning, dynamic programming

Monday, August 06, 2007

Attribute Selection With Weka

Using the "Select attributes" tab in Weka to do some
wrapper-based feature subset selection, I encountered several different
search methods.

Greedy stepwise with parameters
"conservativeForwardSelection" = False
"searchBackwards" = False

It does forward selection starting from an empty set of
attributes. It stops adding attributes as soon as there
is no single addition that improves upon the current
best subset's merit score.


Greedy stepwise with parameter
"conservativeForwardSelection" = True, and
"searchBackwards" = False

It does the same, except that it will continue to add new features as
long as doing so does not decrease the merit of the current best subset.
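The two stopping rules can be sketched in a few lines of Java. This is an illustrative sketch, not Weka's implementation: GreedyForward, select, and the merit function parameter are hypothetical names, and in a real wrapper the merit would be something like the cross-validated accuracy of a classifier trained on the subset.

```java
import java.util.Set;
import java.util.TreeSet;
import java.util.function.ToDoubleFunction;

// Hypothetical sketch of greedy forward selection with the two stopping rules.
final class GreedyForward {
    static Set<Integer> select(int numAttrs,
                               ToDoubleFunction<Set<Integer>> merit,
                               boolean conservative) {
        Set<Integer> best = new TreeSet<>();           // start from the empty set
        double bestMerit = merit.applyAsDouble(best);
        while (best.size() < numAttrs) {
            int bestAttr = -1;
            double bestCandidate = Double.NEGATIVE_INFINITY;
            // Try every single-attribute addition and remember the best one.
            for (int a = 0; a < numAttrs; a++) {
                if (best.contains(a)) continue;
                Set<Integer> trial = new TreeSet<>(best);
                trial.add(a);
                double m = merit.applyAsDouble(trial);
                if (m > bestCandidate) {
                    bestCandidate = m;
                    bestAttr = a;
                }
            }
            // Default rule: require a strict improvement.
            // Conservative rule: accept as long as merit does not decrease.
            boolean accept = conservative
                ? bestCandidate >= bestMerit
                : bestCandidate > bestMerit;
            if (!accept) break;
            best.add(bestAttr);
            bestMerit = bestCandidate;
        }
        return best;
    }
}
```

With a merit function in which one attribute neither helps nor hurts, the default rule leaves that attribute out while the conservative rule includes it.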


BestFirst with parameter "direction" = Forward
BestFirst is a beam search. It allows backtracking to
explore other promising search paths.

For details, use the "More" button in the GenericObjectEditor
when selecting or altering parameters for
search methods in the Explorer.

Thursday, August 02, 2007

Zone and CPU shares

$ pooladm -e
$ pooladm -s
$ pooladm -c

$ poolcfg -c 'create pset pset1 (uint pset.min = 2 ; uint pset.max = 2)'
$ poolcfg -c 'create pset pset2 (uint pset.min = 1 ; uint pset.max = 1)'

$ poolcfg -c 'create pool pool1'
$ poolcfg -c 'create pool pool2'

$ poolcfg -c 'associate pool pool1 (pset pset1)'
$ poolcfg -c 'associate pool pool2 (pset pset2)'

$ pooladm -c

Assuming the zones are up and running:
$ poolbind -p pool1 -i zoneid machine1
$ poolbind -p pool2 -i zoneid machine2
$ poolbind -p pool2 -i zoneid machine3
$ poolbind -p pool2 -i zoneid machine4
$ poolbind -p pool2 -i zoneid machine5

To make the bindings persistent, use
$ zonecfg -z <zonename> set pool=<poolname>

Sun Cluster supports Zone

SC3.2 does support treating a zone as a cluster node. See http://docs.sun.com/app/docs/doc/819-6611/6n8k5u1mc?a=view#gdamq

Zone memory limits and SWAP issues

By default, there is no memory limit for zones. A zone's processes are allowed to
use as much RAM and swap as they want. Resource controls can be used to impose such limits.

Fork calls fail when there isn't enough swap space. The call returns with error
ENOMEM, which is 12. The second failure in your log below returned a status
of 12.

Sample Size and Dimensionality

Sample size and dimensionality are critical to parametric optimization
of machine learning and prediction. Small datasets with high
dimensionality pose the low-ROC problem known in the research community.

A naive Bayes classifier (Maron, 1961) is a simple probabilistic
classifier based on applying Bayes' theorem with strong independence
assumptions. Depending on the precise nature of the probability model,
naive Bayes classifiers can be trained very
efficiently in a supervised learning setting. In many practical
applications, parameter estimation for naive Bayes models uses the
method of maximum likelihood. Recent research on the Bayesian
classification problem has shown that there are some theoretical reasons
for the apparently unreasonable efficacy of naive Bayes
classifiers (Zhang, 2004). Because independent variables are assumed,
only the variances of the variables for each class need to be determined
and not the entire covariance matrix. Hence, a naive Bayes classifier
requires only a small amount of training data for classification prediction.
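To make the saving concrete, here is a rough parameter count per class. It assumes a Gaussian class-conditional model over d features, which is an assumption for illustration and not something the text above specifies. Each class needs d means in both cases; the difference is in the second-order terms.

```latex
\begin{align*}
\text{naive Bayes (variances only):}\quad & d + d = 2d \\
\text{full covariance matrix:}\quad & d + \frac{d(d+1)}{2} \\
\text{for } d = 100:\quad & 200 \ \text{versus}\ 5150
\end{align*}
```

Fewer parameters to estimate is one way to see why the independence assumption lets the model get by with less training data.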

Support vector machines (SVMs) are another set of supervised learning
methods for classification (Cortes & Vapnik, 1995). An SVM maps input
vectors to a higher dimensional space where a maximal separating
hyperplane is created. Two parallel hyperplanes
are constructed on each side of the hyperplane that separates samples.
The separating hyperplane is the hyperplane that maximizes the distance
between the two parallel hyperplanes. The larger the margin or distance
between these parallel hyperplanes is, the lower the generalization
error of the classifier will be. SVMs require large samples.
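The geometry described above can be written down for the simplest case. The following is the standard hard-margin linear formulation, given as a sketch; the symbols w, b, x_i and y_i (with labels y_i in {-1, +1}) are notation introduced here, and the mapping to a higher dimensional space is left out for brevity.

```latex
\begin{align*}
\text{separating hyperplane:}\quad & \mathbf{w}\cdot\mathbf{x} - b = 0 \\
\text{parallel hyperplanes:}\quad & \mathbf{w}\cdot\mathbf{x} - b = \pm 1 \\
\text{margin:}\quad & \frac{2}{\lVert \mathbf{w} \rVert} \\
\text{training problem:}\quad & \min_{\mathbf{w},\,b}\ \tfrac{1}{2}\lVert \mathbf{w} \rVert^{2}
\quad\text{subject to}\quad y_i\,(\mathbf{w}\cdot\mathbf{x}_i - b) \ge 1 \ \text{for all } i
\end{align*}
```

Maximizing the margin is therefore the same as minimizing the norm of w under the constraint that every sample lies on the correct side of its parallel hyperplane.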

TCP monitoring

(1) List all TCP tunables

ndd /dev/tcp \?

(2) Get a TCP tunable value

ndd -get /dev/tcp tcp_conn_req_max_q0

(3) Set a TCP tunable

ndd -set /dev/tcp tcp_conn_req_max_q0 <value>