Thursday, March 29, 2012

Operating Systems Day 02

The CPU (Central Processing Unit) executes instructions. The instructions operate on data that are stored in registers.

Instructions and data are both fetched from storage.

Instructions, like data, are represented in binary form, and they are fetched from storage in the same way as data.

MAR: Memory Address Register: stores the address of the fetch

MDR: Memory Data Register (a.k.a. Memory Buffer Register MBR): stores the data of the fetch

CPU registers are storage locations that the CPU uses to do its calculations and manage the system.

PC: Program Counter, stores the address of the next instruction

Instruction Decoder: interprets the instruction

A computer program is a series of instructions, (usually) stored in memory. When running a program, the processor starts at the beginning of the program, executing instructions using a process known as the instruction cycle:

Fetch: the next instruction to run is fetched from memory (i.e. the CPU fetches instructions from memory according to the value of the PC). In this Fetch cycle, the PC holds the address of the next instruction. The content of the PC is copied to the MAR, and the instruction stored at that address is loaded into the MDR. The MDR content is then copied to the CIR (Current Instruction Register). The contents of the CIR are sent to the Instruction Decoder to be decoded (i.e. the Control Unit looks in the Instruction Set to see what needs to be done to execute a "LOAD" statement, for example). Finally, the PC is updated with the address of the next instruction in memory.

Decode: the instruction information is decoded and may cause operands (data) to be fetched from memory. In this Decode cycle, if operands (data) need to be fetched from memory, the MAR will be updated so that the correct address of the operands (data) in memory can be retrieved.

Execute: the instruction is run and results may be stored back in memory.
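
The cycle above can be sketched as a toy simulator. This is only an illustration: the opcodes LOAD, ADD, STORE and HALT, the single accumulator register and the memory layout are assumptions made up for the example, not part of any real instruction set.

```python
# Toy machine: each memory cell holds either an instruction tuple or a number.
# The opcodes and the accumulator are hypothetical, chosen for illustration.
def run(memory):
    pc = 0      # Program Counter: address of the next instruction
    acc = 0     # accumulator register used by the toy instructions
    while True:
        # Fetch: PC -> MAR, memory[MAR] -> MDR, MDR -> CIR, then advance the PC
        mar = pc
        mdr = memory[mar]
        cir = mdr
        pc = pc + 1
        # Decode: split the instruction into an opcode and an operand address
        opcode, address = cir
        # Execute: operands are fetched through the MAR and MDR in the same way
        if opcode == 'LOAD':
            mar = address
            mdr = memory[mar]
            acc = mdr
        elif opcode == 'ADD':
            mar = address
            mdr = memory[mar]
            acc = acc + mdr
        elif opcode == 'STORE':
            mar = address
            memory[mar] = acc
        elif opcode == 'HALT':
            return memory

program = [
    ('LOAD', 4),    # address 0
    ('ADD', 5),     # address 1
    ('STORE', 6),   # address 2
    ('HALT', 0),    # address 3
    7,              # address 4: first operand
    35,             # address 5: second operand
    0,              # address 6: the result is stored here
]
result = run(program)
print(result[6])
```

Running the program stores 7 + 35 = 42 at address 6. Instructions and data sit in the same memory list, which mirrors the point that both are fetched from storage.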

References
McHoes, A., & Flynn, I.M. (2008). Understanding operating systems (5th ed.). CENGAGE Learning.
Fernandez, G., BN103/BN103D Platform Technologies Lecture Notes, Melbourne Institute of Technology, 2012.

Monday, March 26, 2012

Artificial neural networks Day 04

The validation and test curves are very similar. It means that there is no problem with the training. If the test curve had increased significantly before the validation curve increased, then it is possible that some overfitting might have occurred.

A regression plot shows the relationship between the outputs of the network and the targets. If the training were perfect, the network outputs and the targets would be exactly equal, but the relationship is rarely perfect in practice.

When using a gradient descent algorithm, you typically use a smaller learning rate for batch mode training than incremental training, because all the individual gradients are summed before determining the step change to the weights.

When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases.

The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error.

The test set error is not used during training, but it is used to compare different models. It is also useful to plot the test set error during the training process. If the error on the test set reaches a minimum at a significantly different iteration number than the validation set error, this might indicate a poor division of the data set.

In incremental mode, the gradient is computed and the weights are updated after each input is applied to the network. In batch mode, all the inputs in the training set are applied to the network before the weights are updated.
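
The difference between the two modes, and the earlier remark about the learning rate, can be sketched with a single linear neuron y = w*x trained by gradient descent on the squared error. The function names, the data and the learning rates are assumptions chosen for the illustration; this is not how the toolbox implements training.

```python
def train_incremental(xs, ts, w, lr, epochs):
    # The weight is updated after every single input
    for _ in range(epochs):
        for x, t in zip(xs, ts):
            w = w + lr * (t - w * x) * x
    return w

def train_batch(xs, ts, w, lr, epochs):
    # The individual gradients are summed, then the weight is updated once
    for _ in range(epochs):
        step = sum((t - w * x) * x for x, t in zip(xs, ts))
        w = w + lr * step
    return w

xs = [1, 2, 3]
ts = [2, 4, 6]      # generated by t = 2*x, so the ideal weight is 2

w_inc = train_incremental(xs, ts, 0.0, 0.2, 50)
w_bat_large = train_batch(xs, ts, 0.0, 0.2, 50)
w_bat_small = train_batch(xs, ts, 0.0, 0.05, 50)
print(w_inc)
print(w_bat_large)
print(w_bat_small)
```

With a learning rate of 0.2 the incremental version settles near 2, while the batch version overshoots further on every epoch and diverges, because the summed gradient produces a much larger step. Reducing the batch learning rate to 0.05 makes it converge as well.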

For most problems, when using the Neural Network Toolbox™ software, batch training is significantly faster and produces smaller errors than incremental training.

During training, the progress is constantly updated in the training window. Of most interest are the performance, the magnitude of the gradient of performance and the number of validation checks. The magnitude of the gradient and the number of validation checks are used to terminate the training. The gradient will become very small as the training reaches a minimum of the performance. If the magnitude of the gradient is less than 1e-5, the training will stop. This limit can be adjusted by setting the parameter net.trainParam.min_grad. The number of validation checks represents the number of successive iterations that the validation performance fails to decrease. If this number reaches 6 (the default value), the training will stop.

Parameter   Stopping Criteria
min_grad    Minimum Gradient Magnitude
max_fail    Maximum Number of Validation Increases
time        Maximum Training Time
goal        Minimum Performance Value
epochs      Maximum Number of Training Epochs (Iterations)
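
The validation-check criterion (max_fail) can be sketched as follows. This is a simplified illustration that works on a ready-made list of validation errors, one per epoch. The function name early_stopping and the error values are hypothetical, and the toolbox's own bookkeeping may differ in detail.

```python
def early_stopping(val_errors, max_fail=6):
    # Returns (epoch at which training stops, epoch with the best validation error)
    best_epoch = 0
    best_error = val_errors[0]
    fails = 0
    for epoch in range(1, len(val_errors)):
        if val_errors[epoch] < best_error:
            # Validation error improved: remember this epoch, reset the counter
            best_error = val_errors[epoch]
            best_epoch = epoch
            fails = 0
        else:
            # Validation error failed to decrease
            fails += 1
            if fails >= max_fail:
                return epoch, best_epoch
    return len(val_errors) - 1, best_epoch

errors = [1.0, 0.8, 0.6, 0.5, 0.55, 0.6, 0.65, 0.7, 0.75, 0.8, 0.9]
stop_epoch, best_epoch = early_stopping(errors)
print(stop_epoch)
print(best_epoch)
```

Here the minimum validation error occurs at epoch 3, and training stops at epoch 9 after six successive failures. The weights saved at epoch 3 would be the ones kept.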

References

MATLAB 2011b Help Documentation

Saturday, March 24, 2012

Python programming Day 04


#Return the content of the web page
#E.g. print get_page('http://www.monash.edu')
import urllib
def get_page(url):
    try:
        return urllib.urlopen(url).read()
    except:
        return ""

Thursday, March 22, 2012

Python programming Day 03


#Define a procedure, product_list,
#takes as input a list of numbers,
#and returns a number that is
#the result of multiplying all
#those numbers together.

#product_list([9]) => 9
#product_list([1,2,3,4]) => 24

def product_list(my_list):
    #Start from 1 (the multiplicative identity) and multiply in every element.
    #A one-element list needs no special case; an empty list returns 1.
    product=1
    for e in my_list:
        product=product*e
    return product

Operating Systems Day 01

Figure below shows four subsystem managers supporting the User Command Interface.


Figure below shows the network operating system



Figure below shows the current design of operating systems.


Figure below shows a fixed-head disk.


Figure below shows a movable-head disk.



Figure below shows a magnetic disk.



Figure below shows an optical disk.



The CPU (Central Processing Unit) executes instructions that operate on data in registers.

The cycle is Fetch-Decode-Execute.

The clock of a modern desktop computer can tick in excess of 3 billion times per second (3 GHz).

File Manager: the section of the operating system responsible for controlling the use of files.

Kernel: the primary part of the operating system. It remains in random access memory (RAM) at all times, performs the system’s most essential tasks, such as managing main memory and disk access, and is protected by hardware.

Operating system: the software that manages all the resources of a computer system.

Processor Manager: a composite of two submanagers, the Job Scheduler and the Process Scheduler, which decides how to allocate the CPU.

Software: a collection of programs used to perform certain tasks. Software falls into three main categories: operating system programs, compilers and assemblers, and application programs.

Thread: a portion of a program that can run independently of other portions. Multithreaded application programs can have several threads running at one time with the same or different priorities.

Virtualization: the creation of a virtual version of hardware or software. Operating system virtualization allows a single CPU to run multiple operating system images at the same time.

References
McHoes, A., & Flynn, I.M. (2008). Understanding operating systems (5th ed.). CENGAGE Learning.

Monday, March 19, 2012

Artificial neural networks Day 03

The multilayer feedforward neural network is the workhorse of the Neural Network Toolbox software. It can be used for both function fitting and pattern recognition problems. With the addition of a tapped delay line, it can also be used for prediction problems (see Focused Time-Delay Neural Network (timedelaynet)).

Feedforward networks often have one or more hidden layers of sigmoid neurons followed by an output layer of linear neurons. Multiple layers of neurons with nonlinear transfer functions allow the network to learn nonlinear relationships between input and output vectors. The linear output layer is most often used for function fitting (or nonlinear regression) problems.

On the other hand, if you want to constrain the outputs of a network (such as between 0 and 1), then the output layer should use a sigmoid transfer function (such as logsig). This is the case when the network is used for pattern recognition problems (in which a decision is being made by the network).

The work flow for the general neural network design process has seven primary steps:

  1. Collect data
  2. Create the network
  3. Configure the network
  4. Initialize the weights and biases
  5. Train the network
  6. Validate the network (post-training analysis)
  7. Use the network

The following steps demonstrate how to solve a fitting problem using a feedforward neural network.

1a. Define the input e.g. load house_dataset; % This house_dataset includes the input and target called houseInputs and houseTargets, respectively.
1b. Define the target % houseTargets as the target
2. Define the network architecture i.e. net=feedforwardnet;
3. Configure the network i.e. net=configure(net,houseInputs,houseTargets); % Initialize the weights and biases (you can initialise them again by using net=init(net)).
4. Initialize the weights and biases if needed.
5. Train the network i.e. [net,tr]= train(net, houseInputs, houseTargets); % The train command initialises the weights and biases as well. The configure command is not used here. 'tr' is the training record.
6. Validate the network i.e. plotperf(tr) % Plot the performance progress
7. Use the network i.e. a = net(houseInputs(:,5)) % Find the output for the input vector at column 5.
OR a = net(houseInputs); % Compute all the outputs for the input 'houseInputs' (omit the semicolon to print them).

In particular, to generate some sample code to reproduce the function fitting examples shown above, you can run the neural fitting GUI, nftool. Select the house pricing data from the GUI, and after you have trained the network, click the Advanced Script button on the final pane of the GUI.

If you are interested in using a multilayer neural network for pattern recognition, use the pattern recognition GUI, nprtool. It will lead you through a similar set of design steps for pattern recognition problems, and can then generate example code demonstrating the many options that are available for pattern recognition networks.

You would normally use Levenberg-Marquardt training for small and medium size networks, if you have enough memory available. If memory is a problem, then there are a variety of other fast algorithms available. For large networks you will probably want to use trainscg or trainrp.

There could be three different error surfaces for a multilayer network. The problem is that nonlinear transfer functions in multilayer networks introduce many local minima in the error surface. As gradient descent is performed on the error surface, depending on the initial starting conditions, it is possible for the network solution to become trapped in one of these local minima. Although a multilayer backpropagation network with enough neurons can implement just about any function, backpropagation does not always find the correct weights for the optimum solution.

Networks are also sensitive to the number of neurons in their hidden layers. Too few neurons can lead to underfitting. Too many neurons can contribute to overfitting, in which all training points are well fitted, but the fitting curve oscillates wildly between these points.

In multilayer networks, sigmoid transfer functions are generally used in the hidden layers. These functions become essentially saturated when the net input is greater than three (exp (−3) ≅ 0.05). If this happens at the beginning of the training process, the gradients will be very small, and the network training will be very slow. In the first layer of the network, the net input is a product of the input times the weight plus the bias. If the input is very large, then the weight must be very small in order to prevent the transfer function from becoming saturated. It is standard practice to normalize the inputs before applying them to the network.

Generally, the normalization step is applied to both the input vectors and the target vectors in the data set. In this way, the network output always falls into a normalized range. The network output can then be reverse transformed back into the units of the original target data when the network is put to use in the field.
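
The normalize-then-reverse idea can be sketched with simple min-max scaling to the range [-1, 1]. The function names and the target values are assumptions made for the illustration, and the toolbox's own processing functions may handle details differently. The sketch also assumes the data are not all identical, otherwise the division fails.

```python
def fit_minmax(values):
    # Remember the range of the original data so the mapping can be reversed
    return min(values), max(values)

def normalize(values, lo, hi):
    # Map [lo, hi] linearly onto [-1, 1]
    return [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]

def denormalize(values, lo, hi):
    # Reverse transform from [-1, 1] back into the original units
    return [(v + 1.0) * (hi - lo) / 2.0 + lo for v in values]

targets = [10.0, 20.0, 30.0]
lo, hi = fit_minmax(targets)
scaled = normalize(targets, lo, hi)
restored = denormalize(scaled, lo, hi)
print(scaled)
print(restored)
```

The network would be trained on the scaled values, and its outputs would be passed through denormalize before being used in the original units.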

Most of the network creation functions in the toolbox, including the multilayer network creation functions, such as feedforwardnet, automatically assign processing functions to your network inputs and outputs. These functions transform the input and target values you provide into values that are better suited for network training.

When training multilayer networks, the general practice is to first divide the data into three subsets. The first subset is the training set, which is used for computing the gradient and updating the network weights and biases. The second subset is the validation set. The error on the validation set is monitored during the training process. The validation error normally decreases during the initial phase of training, as does the training set error. However, when the network begins to overfit the data, the error on the validation set typically begins to rise. The network weights and biases are saved at the minimum of the validation set error.

Mean Squared Error is the average squared difference between outputs and targets. Lower values are better. Zero means no error.

Regression R values measure the correlation between outputs and targets. An R value of 1 means a close relationship, 0 a random relationship.
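
Both measures can be computed directly from the outputs and targets. The sketch below uses the standard mean squared error and the Pearson correlation coefficient. The function names and sample values are assumptions made for the illustration, and the example assumes neither list is constant.

```python
import math

def mean_squared_error(outputs, targets):
    # Average of the squared differences between outputs and targets
    n = len(outputs)
    return sum((o - t) ** 2 for o, t in zip(outputs, targets)) / float(n)

def correlation_r(outputs, targets):
    # Pearson correlation coefficient between outputs and targets
    n = len(outputs)
    mean_o = sum(outputs) / float(n)
    mean_t = sum(targets) / float(n)
    cov = sum((o - mean_o) * (t - mean_t) for o, t in zip(outputs, targets))
    var_o = sum((o - mean_o) ** 2 for o in outputs)
    var_t = sum((t - mean_t) ** 2 for t in targets)
    return cov / math.sqrt(var_o * var_t)

targets = [1.0, 2.0, 3.0, 4.0]
outputs = [1.5, 2.5, 3.5, 4.5]      # every output is off by the same 0.5
mse = mean_squared_error(outputs, targets)
r = correlation_r(outputs, targets)
print(mse)
print(r)
```

In this example R is 1 even though the error is not zero, because the outputs follow the targets perfectly apart from a constant offset. The two measures are therefore best read together.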

Regression for the output? How to compute?
What is training state?
What are nonlinear regression problems?

References
MATLAB 2011b Help Documentation

Neighborhood processing using MATLAB

Correlation operation: correlation is the same as convolution without the mirroring (flipping) of the mask before the sums of products are computed.

Convolution operation: convolution is a widely used mathematical operator that processes an image by computing—for each pixel—a weighted sum of the values of that pixel and its neighbors.

Convolution with masks is a very versatile image processing method. Depending on the choice of mask coefficients, entirely different results can be obtained, for example, image blurring, image sharpening, or edge detection.

Linear Filters: Here the resulting output pixel is computed as a sum of products of the pixel values and mask coefficients in the pixel’s neighborhood in the original image. E.g.: mean filter.

Nonlinear Filters: Here the resulting output pixel is selected from an ordered (ranked) sequence of pixel values in the pixel’s neighborhood in the original image. E.g.: median filter, the max and min filters.

%Perform convolution operation using MATLAB
>> a = [0 0 0 1 0 0 0]; % signal

>> f = [1 2 3 4 5]; % filter
>> g = imfilter(a,f,'full','conv')
g =
     0     0     0     1     2     3     4     5     0     0     0
>> g = imfilter(a,f,'same','conv')

g =
     0     1     2     3     4     5     0

%Perform correlation operation on the same filter
>> h = imfilter(a,f,'full','corr')

h =
     0     0     0     5     4     3     2     1     0     0     0
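
The 1D results above can be reproduced with a short plain-Python sketch of the 'full' correlation and convolution. The helper names correlate_full and convolve_full are hypothetical; zero padding is assumed outside the signal, which matches the results shown.

```python
def correlate_full(signal, mask):
    # Slide the mask over the zero-padded signal and sum the products
    n, m = len(signal), len(mask)
    padded = [0] * (m - 1) + list(signal) + [0] * (m - 1)
    return [sum(mask[k] * padded[i + k] for k in range(m))
            for i in range(n + m - 1)]

def convolve_full(signal, mask):
    # Convolution is correlation with the mask flipped (mirrored) first
    return correlate_full(signal, mask[::-1])

a = [0, 0, 0, 1, 0, 0, 0]   # signal
f = [1, 2, 3, 4, 5]         # filter
print(convolve_full(a, f))
print(correlate_full(a, f))
```

Because the signal is a single impulse, convolution copies the mask into the output as it is, while correlation copies it reversed.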

%Perform correlation operation on the 2D signal

>> x = [140 108 94;89 99 125;121 134 221] % signal
x =
   140   108    94
    89    99   125
   121   134   221

>> y = [-1 0 1;-2 0 2;-1 0 1] % filter mask
y =
    -1     0     1
    -2     0     2
    -1     0     1

>> z = imfilter(x,y,'corr')
z =
   315   -56  -315
   440   126  -440
   367   236  -367

To perform convolution, we use the same technique as in correlation. The difference here is that the filter matrix is rotated by 180° before performing the sum of products.

>> z2 = imfilter(x,y,'conv')
z2 =
  -315    56   315
  -440  -126   440
  -367  -236   367

%Generate a mean (average) filter

>> fn = fspecial('average')
fn =
    0.1111    0.1111    0.1111
    0.1111    0.1111    0.1111
    0.1111    0.1111    0.1111

%Apply the mean filter to the original image

>> I = imread('cameraman.tif');
>> figure, subplot(1,2,1), imshow(I), title('Original Image');

>> I_new = imfilter(I,fn);
>> subplot(1,2,2), imshow(I_new), title('Filtered Image');

The below figure shows the effects of the mean filter operation.


%Generate non-uniform mean filter
>> fn2 = [1 2 1; 2 4 2; 1 2 1]

fn2 =
     1     2     1
     2     4     2
     1     2     1
>> fn2 = fn2 * (1/16)
fn2 =

    0.0625    0.1250    0.0625
    0.1250    0.2500    0.1250
    0.0625    0.1250    0.0625

%Apply the non-uniform mean filter to the original image to compare with the uniform one.
>> I_new2 = imfilter(I,fn2);
>> figure, subplot(1,2,1), imshow(I_new), title('Uniform Average');
>> subplot(1,2,2), imshow(I_new2), title('Non-uniform Average');

The below figures show the difference in subjective quality evaluation between two outputs.


The Gaussian filter is similar to the nonuniform averaging filter in that the coefficients are not all equal. The coefficient values, however, are not chosen by hand: they are sampled from a two-dimensional Gaussian curve, so they fall off smoothly with distance from the center pixel.
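
As a rough illustration of how such a mask can be built, the sketch below samples exp(-(x^2 + y^2) / (2*sigma^2)) on a square grid and normalizes the coefficients so that they sum to 1. The function name gaussian_kernel is hypothetical, and the sketch is not claimed to reproduce every detail of fspecial.

```python
import math

def gaussian_kernel(size, sigma):
    # Sample a 2D Gaussian centered on the middle cell of a size-by-size grid
    half = (size - 1) / 2.0
    raw = [[math.exp(-((r - half) ** 2 + (c - half) ** 2) / (2.0 * sigma ** 2))
            for c in range(size)] for r in range(size)]
    # Normalize so that filtering does not change the overall brightness
    total = sum(sum(row) for row in raw)
    return [[v / total for v in row] for row in raw]

kernel = gaussian_kernel(9, 1.5)
print(kernel[4][4])     # center coefficient, the largest one
print(kernel[0][0])     # corner coefficient, the smallest one
```

The 9 and 1.5 mirror the size and standard deviation used in the MATLAB call in this section.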

%Generate the Gaussian filter and draw the filter as 3D graph

>> fn_gau = fspecial('gaussian',9,1.5);
>> figure, bar3(fn_gau,'b'), title('Gaussian filter as a 3D graph');

%Apply the Gaussian filter to the original image

>> I_new3 = imfilter(I,fn_gau);
>> figure
subplot(1,3,1), imshow(I), title('Original Image');
subplot(1,3,2), imshow(I_new), title('Average Filter');
subplot(1,3,3), imshow(I_new3), title('Gaussian Filter');

The below figure shows the subjective quality difference between two filtered images.


References

  1. Oge Marques, Practical Image and Video Processing Using MATLAB, Wiley-IEEE Press, September 2011.

Saturday, March 17, 2012

Python programming Day 02

#Compute median number among the three numbers
def bigger(a,b):
    if a > b:
        return a
    else:
        return b
     
def biggest(a,b,c):
    return bigger(a,bigger(b,c))

def median(a,b,c):
    biggest_number=biggest(a,b,c)
    if a==biggest_number:
        median_number=bigger(b,c)
    else:
        if b==biggest_number:
            median_number=bigger(a,c)
        else:
            median_number=bigger(a,b)
    return median_number
 
print median(4,5,6)

#Print numbers n,n-1,n-2,...,1, Blastoff
def countdown(n):
    while n >= 1:
        print n
        n=n-1
    print 'Blastoff'
 
countdown(9)

#Find the last position of a target string in the search string. If there are no occurrences output is -1.

def find_last(s,t):
    post=0
    result=s.find(t,post)
    if result == -1:
        return -1
    else:
        while result != -1:
            post=result+1
            result=s.find(t,post)
        return post-1

find_last('abc','a') # Output will be 0.

#Print the multiplication table
def print_multiplication_table(n):
    p=1
    while p<=n:
        m=1
        while m <=n:
            result=m*p
            print str(p) + '*' + str(m) + '=' + str(result)
            m=m+1
        p=p+1

print_multiplication_table(10)

#To create a list, use
my_list=['a','b','d']
print my_list[0] #This will print a

#To create a tuple, use
my_tuple=('x','y')

#An example of accessing a nested list

countries = [['China','Beijing',1350],
             ['India','Delhi',1210],
             ['Romania','Bucharest',21],
             ['United States','Washington',307]]

#Write code to print out the capital of India by accessing the array.
print countries[1][1]

# List operations
my_list=[1,2]
my_list.append(3)
my_list=my_list + [4,5]
len(my_list) # This results in 5

#List operations using append method
p=[1,2]

q=[3,4]
p.append(q)
print p
#Output: [1, 2, [3, 4]]

#Using a FOR loop to print out number of days in each month
days_in_month=[31,28,31,30,31,30,31,31,30,31,30,31]
months=[1,2,3,4,5,6,7,8,9,10,11,12]
for i in months:
     print days_in_month[i-1]

#Using a FOR loop to print the sum of the list

def sum_list(my_list):
    sum=0
    for e in my_list:
        sum=sum+e
    return sum
 
print sum_list([1,7,4,8,9])

#The following function takes a list of strings and returns a new list that contains capitalized strings.
def capitalize_all(t):
    result = []
    for s in t:
        result.append(s.capitalize())
    return result

#Several ways to delete elements in the list
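
A few common ways are sketched below, using only built-in list operations (del, pop, remove). The sample list is made up for the illustration.

```python
my_list = ['a', 'b', 'c', 'd', 'e', 'f']

del my_list[0]              # delete by index: removes 'a'
last = my_list.pop()        # remove and return the last element
second = my_list.pop(1)     # remove and return the element at index 1
my_list.remove('e')         # delete the first element equal to 'e'
del my_list[0:1]            # delete a slice: removes 'b'

print(my_list)
print(last)
print(second)
```

After these steps only 'd' is left in the list; pop() returned 'f' and pop(1) returned 'c'. Use pop when the removed value is needed, remove when only the value (not the index) is known, and del for an index or a slice.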


#Define a procedure, measure_udacity, that takes its input a list of Strings,
#and outputs a number that is a count of the number of elements in the input
#list that start with the letter 'U' (uppercase).

#measure_udacity(['Dave','Sebastian','Katy']) => 0

#measure_udacity(['Umika','Umberto']) => 2
def measure_udacity(my_list):
    count=0
    for i in range(len(my_list)):
        if my_list[i][0]=='U':
            count=count+1
    return count

print measure_udacity(['Dave','Sebastian','Katy'])

#Another way to do it

def measure_udacity(my_list):
    count=0
    for e in my_list:
        if e[0]=='U':
            count=count+1
    return count

print measure_udacity(['Dave','Sebastian','Katy'])

#Define a procedure, find_element, that takes as its inputs a List and a value of any type, and
#outputs the index of the first element in the input list that matches the value.

#If there is no matching element, output -1.
#find_element([1,2,3],3) => 2
#find_element(['alpha','beta'],'gamma') => -1

def find_element(my_list, t):
    index=0
    for e in my_list:
        if e == t:
            return index
        index=index + 1
    return -1
 
print find_element([1,2,5,3,6,3],3)
print find_element(['alpha','beta'],'gamma')    

#Another way to do it using the 'index' method and 'in' operator
def find_element(my_list,t):
    if t not in my_list:
        return -1
    else:
        return my_list.index(t)

print find_element([1,2,3],3)
print find_element(['alpha','beta'],'gamma')

If we need to return the index for the output, a WHILE loop is a good choice.

>>> p=(0,1,2) #p is a tuple
>>> print p
(0, 1, 2)
>>> q=[0,1,2] #q is a list
>>> print q
[0, 1, 2]

<List>.pop()
Mutates <List> by removing its last element. Outputs the value of that element. If there are no elements in <List>, [].pop() produces an error.

# After running the procedure proc1(p) the value of p is unchanged.
def proc1(p):
      p=p+[1]
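
The reason is that p=p+[1] builds a new list and rebinds only the local name p. A sketch comparing it with a mutating version follows; the name proc2 and the variables a and b are made up for the comparison.

```python
def proc1(p):
    p = p + [1]     # creates a new list and rebinds the local name only

def proc2(p):
    p.append(1)     # mutates the list object that the caller passed in

a = [0]
proc1(a)
b = [0]
proc2(b)
print(a)
print(b)
```

After the calls, a is still [0] while b has become [0, 1].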

#Define a procedure, greatest, that takes as input a list of positive numbers, and
#returns the greatest number in that list. If the input list is empty, the output should be 0.

#greatest([4,23,1]) => 23
#greatest([]) => 0

def greatest(my_list):
    max=0
    if len(my_list)==0:
        return 0
    else:
        for e in my_list:
            if e >= max:
                max=e
        return max

References
Allen B. Downey, Think Python: How to Think Like a Computer Scientist, http://www.greenteapress.com/thinkpython/html/index.html
Udacity, CS101 Lecture Notes, February-April 2012

Thursday, March 15, 2012

CRM and data mining Day 06

What are views?

A view is a stored database query that provides a database user with a customized subset of the data from one or more tables in the database. Said another way, a view is a virtual table because it looks like a table and for the most part behaves like a table, yet it stores no data (only the defining query is stored).


Views serve a number of useful functions:

  1. Hiding columns that the user does not need to see (or should not be allowed to see)
  2. Hiding rows from tables that a user does not need to see (or should not be allowed to see)
  3. Hiding complex database operations such as table joins
  4. Improving query performance (in some RDBMSs, such as Microsoft SQL Server)
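
The first two uses can be tried out with Python's built-in sqlite3 module. Note the assumptions: this uses SQLite rather than SQL Server, and the employee table, its columns and the view name sales_staff are invented for the illustration.

```python
import sqlite3

# In-memory SQLite database with a small hypothetical table
conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE employee (id INTEGER, name TEXT, dept TEXT, salary REAL)')
conn.executemany('INSERT INTO employee VALUES (?, ?, ?, ?)',
                 [(1, 'Ann', 'Sales', 5000.0),
                  (2, 'Bob', 'IT', 6000.0),
                  (3, 'Cho', 'Sales', 5500.0)])

# The view hides the salary column and every row outside the Sales department.
# Only the defining query is stored; the view holds no data of its own.
conn.execute("CREATE VIEW sales_staff AS "
             "SELECT id, name FROM employee WHERE dept = 'Sales'")

rows = conn.execute('SELECT id, name FROM sales_staff ORDER BY id').fetchall()
print(rows)
conn.close()
```

The query against the view returns only Ann and Cho, and no salary information.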


How to create views in SQL Server?


Could you find out which objects (tables/dimensions) a fact table depends on?


References
Andy Oppel (2011), Database Demystified, 2nd Ed, McGraw-Hill.

CRM and data mining Day 05

Database normalization process.


An invoice design example.




The invoice is represented in the tabular form.


The invoice without multivalued attributes.


First normal form: eliminating repeating data. A relation is said to be in first normal form when it contains no multivalued attributes. That is, every intersection of a row and column in the relation must contain at most one data value (saying “at most” allows for missing or null values).

Sometimes, we will find a group of attributes that repeat together, as with the line items on the invoice.

To transform unnormalized relations into first normal form, we must move multivalued attributes and repeating groups to new relations.

Because a repeating group is a set of attributes that repeat together, all attributes in a repeating group should be moved to the same new relation.

However, a multivalued attribute (individual attributes that have multiple values) should be moved to its own new relation rather than combined with other multivalued attributes in the new relation.
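
Moving a repeating group to its own relation can be sketched with plain Python dictionaries. The invoice fields, the values and the function name to_first_normal_form are assumptions made up for the illustration.

```python
# Unnormalized invoice: the line items form a repeating group
invoice = {
    'invoice_no': 1001,
    'customer': 'Acme',
    'line_items': [
        {'product': 'Widget', 'qty': 2, 'price': 9.5},
        {'product': 'Gadget', 'qty': 1, 'price': 20.0},
    ],
}

def to_first_normal_form(inv):
    # The invoice relation keeps only single-valued attributes
    header = {'invoice_no': inv['invoice_no'], 'customer': inv['customer']}
    # The whole repeating group moves to one new relation; each row carries
    # the invoice number so it can be joined back to its invoice
    items = []
    for line_no, item in enumerate(inv['line_items'], 1):
        row = {'invoice_no': inv['invoice_no'], 'line_no': line_no}
        row.update(item)
        items.append(row)
    return header, items

header, items = to_first_normal_form(invoice)
print(header)
for row in items:
    print(row)
```

Every cell in both results now holds at most one value, and the pair (invoice_no, line_no) identifies each line item row.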

Second normal form: eliminating partial dependencies i.e. separating data so that any editing of a piece of data only has to be done once

Third normal form: eliminating transitive dependencies i.e. separating data so that when records are deleted other data is not disturbed

Characteristics of high quality data
  1. Correct - accurate
  2. Verifiable - checkable through other existing information
  3. Complete -  data contains all relevant information
  4. Concise - data contains only relevant information
  5. Understandable - most users can readily comprehend what the data means
  6. Current - data is up to date and useful at the current time
  7. Accessible - data available when and where it is required

Problems of poor data quality
  1. Incomplete: not enough information to do a job
  2. Out of date: does not indicate current state of affairs
  3. Not understandable: cannot understand what the data is describing
  4. Not where it is needed: cannot use it because it is too far away
  5. Not when it is needed: cannot get it until after it is needed

Finish page 202.

References
  1. Andy Oppel (2011), Database Demystified, 2nd Ed, McGraw-Hill.
  2. Melbourne Institute of Technology (Semester 2, 2012). BN105 IT for Users in Organisations Lecture Notes.

MATLAB for image and video processing DAY 01

r = randn(n) returns an n-by-n matrix containing pseudorandom values drawn from the standard normal distribution.
( ): used for indexing
[ ]: used to concatenate numbers, vectors, and matrices
{ }: used to represent cell arrays

%Read image and convert it into double
y=im2double(imread('image1.pgm'));

% Add the AWGN to the original image y with a zero mean and a standard deviation 'sigma'

       z = y + (sigma/255)*randn(size(y));


% Add the AWGN to the original image y with a mean of 25 and a standard deviation 'sigma'

       z = y + (25/255) + (sigma/255)*randn(size(y)); % the mean is scaled by 1/255, like sigma, because y is in the range [0,1]


Monday, March 12, 2012

CRM and data mining Day 04

The following shows some reasons to create a data cube.



Data cube.


The architecture behind BI



A clustered index
To add a clustered index to the fact table, usually use one of the date keys.
It will be helpful to have the fact data physically arranged in date order.
It is assumed that you are familiar with views, tables (relational database concepts) and the difference between the two. To avoid unnecessary modifications to the original data, we will make use of views rather than the tables for cube development.

SQL Server Business Intelligence Development Studio (BIDS)

Choose "Native OLE DB\SQL Native Client 10.0" as the provider

Import an existing database using the "restore" option. Note: you might need to use the option to overwrite the existing database.

References
Amit Bansal, Microsoft SQL Server 2008 - Create a Cube, http://www.youtube.com/watch?v=aglwqC8irMA&feature=relmfu


Artificial neural networks Day 02

I. Perceptron model using MATLAB to solve linear classification problems.

The following shows steps to create a Perceptron Neural Network model to solve linear classification problems using MATLAB.

1. Define the input e.g. x=[0 0 1 1;0 1 0 1];
2. Define the target e.g. t=[0 0 0 1]; %  This target shows that we are solving the AND problem.
3. Define the network architecture i.e. net=perceptron;
OR i.e. net = newp([0 1;0 1],[0 1]); % That is, inputs will be 2-element column vectors in which values for each element are 0 or 1, and output will be 1-element "vector" that has values of 0 or 1.
4. Train the network i.e. net= train(net,x,t);
5. View the network i.e. view(net);
6. Simulate the network with a new input to obtain the output e.g. y=sim(net,[0;0]) or y=sim(net,[0;1]).

In step 3 above, if the "newp" function is used, we can set the initial set of weights, bias, learning parameter and other parameters (e.g. number of epochs) as follows.

3a. Set initial set of weights and bias
For example,
w = [1 -0.8]; %i.e. w1=1, w2=-0.8
net.IW{1,1} = w;
net.b{1} =  [0]; %i.e. w0=0

3b. Set learning rate
net.trainParam.lr=0.1;

3c. Set the number of epochs for training
net.trainParam.epochs=10;

7. Find out the final weights and bias after training the network.
w = net.IW{1,1}
b = net.b{1}
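
To see what the training does to the weights and bias, the perceptron learning rule can be sketched in plain Python for the same AND problem and the same initial weights (w1=1, w2=-0.8, bias 0). Assumptions: a hard-limit transfer function that outputs 1 when the net input is greater than or equal to 0, an update of w + e*x and b + e with no separate learning rate, and the function names hardlim and train_perceptron. The toolbox's own training functions may differ in detail.

```python
def hardlim(n):
    # Hard-limit transfer function: 1 if the net input is >= 0, otherwise 0
    return 1 if n >= 0 else 0

def train_perceptron(inputs, targets, w, b, max_epochs=20):
    w = list(w)
    for _ in range(max_epochs):
        errors = 0
        for x, t in zip(inputs, targets):
            y = hardlim(sum(wi * xi for wi, xi in zip(w, x)) + b)
            e = t - y
            if e != 0:
                # Perceptron rule: move the weights and bias by the error
                w = [wi + e * xi for wi, xi in zip(w, x)]
                b = b + e
                errors += 1
        if errors == 0:
            break   # every input is classified correctly
    return w, b

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]
targets = [0, 0, 0, 1]      # the AND problem
w, b = train_perceptron(inputs, targets, [1, -0.8], 0)
outputs = [hardlim(sum(wi * xi for wi, xi in zip(w, x)) + b) for x in inputs]
print(outputs)
```

After training, the four inputs are classified as [0, 0, 0, 1], which matches the AND targets.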

Note: patternnet should be used to solve non-linearly separable problems.

II. Perceptron architecture


References
MATLAB 2010b Help Documentation

Tuesday, March 6, 2012

Multimedia Day 01

The image canvas is the work area of the image and defines the image dimensions — for example, 200 × 300 pixels. If you need more space to add elements to an image, you can increase the canvas size, or you can resize the canvas to make it smaller.

You can change the image size by resizing or cropping, or by printing at a different size.

Reducing the canvas size is not always the same as cropping an image. For images with layers, reducing the canvas size does not delete the pixels outside the new canvas area; it just shows less of the layer.

Pixels have no set physical size. Each pixel represents one sample of a single color. When an image is resized, the number of pixels in the image may be reduced or increased, which causes the image to be resampled. Resampling changes the file size.

Print resolution is defined as the number of pixels per inch (ppi). A higher print resolution creates smaller printed pixels and therefore a smaller printed image. A lower print resolution creates larger printed pixels and a larger printed image. Resizing can be used to:
  1. Change the print resolution and print size while preserving the number of pixels and file size (no resampling)
  2. Change the number of pixels and file size while preserving the print resolution and print size (resampling) 
  3. Do both (resampling)
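
The relationship between pixel count, print resolution and print size is simple arithmetic: print size in inches = number of pixels / ppi. A small sketch follows; the function names and the image dimensions are assumptions made for the illustration.

```python
def print_size_inches(width_px, height_px, ppi):
    # No resampling: the pixels stay the same, only the printed size changes
    return width_px / float(ppi), height_px / float(ppi)

def pixels_for_print(width_in, height_in, ppi):
    # Resampling: pixel dimensions needed for a given print size and resolution
    return int(round(width_in * ppi)), int(round(height_in * ppi))

size_300 = print_size_inches(3000, 1500, 300)
size_150 = print_size_inches(3000, 1500, 150)
needed = pixels_for_print(10, 5, 150)
print(size_300)
print(size_150)
print(needed)
```

A 3000 x 1500 pixel image prints at 10 x 5 inches at 300 ppi and at 20 x 10 inches at 150 ppi. Keeping the 10 x 5 inch print size at 150 ppi would require resampling the image down to 1500 x 750 pixels.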

Figure below shows the materials palette.



Here are some recommendations to help you resize your images:
  1. Avoid increasing the image size by more than 125%. Doing so may cause a loss of detail and sharpness.
  2. Resize an image only once. If you resize the image incorrectly, undo it and try again.
  3. Correct and retouch images before resizing.

When you create or import an image in Corel Paint Shop Pro Photo, the image has a single layer. Depending on the type of image you create or import, the single layer is labeled as Background, Raster, Vector, or Art Media. When you open a photo, scan, or screen capture, the single layer is labeled as Background on the Layers palette.

For most simple corrections and retouching, you do not have to add layers to an image. However, it is a good practice to duplicate the single layer before making image corrections, so that you preserve the original image on its own layer. If you intend to do more complex work — such as adding elements to the image, creating photo compositions, adding text, or applying other effects — the use of layers is highly recommended.

Each layer you add begins as a transparent sheet over the background. As you add brush strokes, vector objects, or text, you cover up parts of the Background layer. Transparent areas allow you to see the underlying layers. You can stack multiple layers to create artistic compositions, photo collages, or complex illustrations.

There are nine types of layers: Background, Raster, Vector, Art Media, Mask, Adjustment, Group, Selection, and Floating Selection. For information about the last three types, see "Working with selections."

1. Background layers

The background layer is the bottom layer of an image. When you import JPEG, GIF, or PNG images into Corel Paint Shop Pro Photo, they have only this single layer, which is named “Background” on the Layers palette.

The background layer contains raster data, but it cannot display transparency. You cannot change its blend mode, opacity, or order in the stack until you promote it to a raster layer.

To position a background layer higher in the layer stack, you can promote it to a regular raster layer. For information about promoting the background layer, see "Promoting the background layer."

If you create a new image with a transparent background, it does not have a background layer, and its bottom layer is a raster layer named “Raster Layer 1.” You can move this layer anywhere within the stacking order. You can also change its opacity and blend mode.

Corel Paint Shop Pro Photo supports transparency on layers other than the background. To create an image without transparent areas, you can choose a solid-colored background. The image contains a background layer as the bottom layer. 

2. Raster layers

Raster layers are layers with raster data only. Raster data is composed of individual elements, called pixels, arranged in a grid. Each pixel has a specific location and color. Photographic images are composed of raster data. If you magnify raster data, you can see the individual pixels as squares of colors.

Raster layers let you display subtle changes in tones and colors. Some tools, options, and commands apply only to raster layers. For example, the painting tools and the commands that are used to add effects can be applied only on raster layers. If you try to use a raster tool while a vector layer is selected, Corel Paint Shop Pro Photo prompts you to convert the vector layer into a raster layer.

Only grayscale and 16 million–color images can have multiple raster layers. When you add a new adjustment layer or raster layer to an image of another color depth, Corel Paint Shop Pro Photo prompts you to convert it to 16 million colors.

For more information about raster and vector data, see "Understanding raster and vector objects."

3. Vector layers

Vector layers are layers with only vector objects (lines and shapes), vector text, or vector groups. Vector objects and text are composed of geometric characteristics — lines, curves, and their locations. When you edit vector objects and text, you edit these lines and curves, rather than the individual pixels. Vector graphics and vector text maintain their clarity and detail at any size or print resolution.

Objects or text created with vector layers can be easily edited. Images of any color depth can include multiple vector layers. Each vector layer contains a list of all individual vector objects on that layer. You can expand or collapse the group to view the individual objects. For more information, see "To expand or collapse a vector layer or layer group." Vector objects can be moved from their layer group to another vector group.

You cannot move a vector object to a nonvector layer; vector objects must be placed on vector layers. If you create a vector object while a raster layer is selected, Corel Paint Shop Pro Photo creates a vector layer just above the selected layer.

For more information about raster and vector data, see "Understanding raster and vector objects."

4. Art Media layers

Art Media layers are automatically created when you use any of the Art Media tools.

When creating a new image, you can choose to create the image with an Art Media layer.

Art Media layers can be converted to raster layers, but not to vector layers.

5. Mask layers

Mask layers show or hide portions of underlying layers. A mask is an adjustment layer that modifies opacity. You can use masks to create sophisticated effects, such as a picture frame that fades away at the center to reveal the subject.

Mask layers cannot be bottom layers. You cannot delete other layers if deleting them would cause a mask layer to become the bottom layer.

For more information about working with masks, see "Working with masks."

6. Adjustment layers

Adjustment layers are correction layers that adjust the color or tone of underlying layers. Each adjustment layer makes the same correction as an equivalent command on the Adjust menu, but unlike the command, the adjustment layer does not change image pixels.

Adjustment layers cannot be the bottom layer. You cannot delete other layers if deleting them would cause an adjustment layer to become the bottom layer. For more information, see "Using adjustment layers."

Management: the act of managing something

Change management: change management is a structured approach to shifting/transitioning individuals, teams, and organizations from a current state to a desired future state. It is an organizational process aimed at helping employees to accept and embrace changes in their current business environment. Change management's goals are to minimize the change impacts on workers and avoid distractions.

Digital animation vs. traditional animation drawn on sheets of acetate

Three file formats that can be used to deliver animation.

Uses of multimedia are to enhance communication and to provide training or entertainment (e.g. gaming).

Applications of multimedia are in marketing, product training, product demonstration, advertising, engineering education and simulation, medical education, journalism, catalogs, databases, and networked communications.


Glossary of terms
Typographic term
Team effectiveness
persistence of vision

References
Wikipedia, http://en.wikipedia.org/wiki/Change_Management
Corel Paint Shop Pro XI Help Document

CRM and data mining Day 03

Gain a better understanding of what are the key motivations of choice and change.
Profile the things they do, the brands they choose, the media they consume.
There are ten mindset segments of the Australian population based on the deeper drivers of choice and change.

There are four human social dimensions (Individualism, Life Satisfaction, Conservatism and Innovation) and two dimensions that establish the Values Segments in marketplace reality (Quality Expectations and Price Expectations).

VALS is a consulting and consumer research service. VALS consulting provides clients with tailored marketing strategies for targeting, positioning, and communications—real-world, real-time, actionable strategies. Consumer surveys and focus groups inform our work.

Investigate what would change a no decision into a yes and what factors influence and predict the behaviours.

Study current market segmentation and customer understanding methods

To understand behaviours of consumers

Understanding the needs of various consumer groups guides new-product and services development.

E.g. 01: A European luxury automobile manufacturer used VALS to identify online, mobile applications that would appeal to affluent, early-adopter consumers within the next five years.

E.g. 02: A major telecommunications-product company used VALS to select an early-adopter target for a new telecommunications concept. VALS enabled the company to develop the product prototype and prioritize features and benefits, with a focus on the early-adopter target. The company used VALS to select the best name and logo, choose an overall positioning strategy, and set an initial price point.

E.g. 03: A Minnesota medical center planned to offer a new line of service: cosmetic surgery. It used VALS Focus Groups to identify target consumers (people most interested in and able to afford the service). By understanding the underlying motivations of the target, the center and its ad agency were able to develop a compelling selling proposition. The resulting advertising was so successful that just a few weeks into the campaign, the center exceeded its scheduling capabilities.

E.g. 04: A leading U.S. bank used VALS/MacroMonitor data from Consumer Financial Decisions to reposition several ubiquitous products in commodity categories. By understanding the emotional benefits sought by target consumers, the advertising agency was able to define unique selling propositions for each product that linked to the corporate branding strategy. The bank achieved about one-third of its customer-acquisition goal 12 weeks into the first campaign.

Marketers can use MyBestSegments to guide marketing campaigns and media strategies for specific market segments by answering:
  1. Who are the potential customers?
  2. What are they like?
  3. Where can I find them?
  4. How can I reach them? 
References
  1. http://www.strategicbusinessinsights.com/vals
  2. http://www.strategicbusinessinsights.com/vals/presurvey.shtml
  3. http://www.roymorgan.com/products/values-segments/values-segments.cfm
  4. http://www.claritas.com/MyBestSegments/Default.jsp
  5. http://www.spatialinsights.com/catalog/product.aspx?product=80&content=1386
  6. http://calabash.ca
  7.  http://www.abc.net.au/iview/#/view/901523 (from ABC "Lateline" TV Wedn 29/Feb/2012).
  8.  www.metafilter.com/113095/If-you-have-nothing-to-hide-why-do-you-have-curtains-on-your-windows
  9.  https://www.eff.org/deeplinks/2012/02/how-remove-your-google-search-history-googles-new-privacy-policy-takes-effect

Monday, March 5, 2012

How to begin a paragraph


Your work needs to be a coherent, flowing and logical piece. That is, it should have a clear and understandable progression and structure.

Remember to succinctly identify the key paragraphs and/or sections of your work in your introductory paragraph. Then restate them alongside an unambiguous position in your concluding paragraph.

Linking words and phrases are used to clearly link the start of each new paragraph with the preceding paragraph. The following linking words and phrases can be used at the start of new paragraphs.

Although, …

As a consequence, …

As a result, …

As we have seen, …

At the same time, …

Accordingly, …

An equally significant aspect of…

Another significant factor in…

Before considering X it is important to note Y

By the same token, …

But we should also consider, …

Despite these criticisms, … its popularity remains high.

Certainly, there is no shortage of disagreement within…

Consequently, …

Correspondingly, …

Conversely, …

Taylor, … in particular, has focused on the

Despite this, …

Despite these criticisms, … the popularity of X remains largely undiminished.

Each of these theoretical positions makes an important contribution to our understanding of …

Evidence in support of this position can be found in …

Evidently, …

For this reason, …

For these reasons, …

Furthermore, …

Given the current high-profile debate with regard to …, it is quite surprising that …

Given the advantages of … outlined in the previous paragraph, it is quite predictable that …

However, …

Having considered X, it is also reasonable to look at …

Hence, …

In addition to, …

In contrast, …

In this way, …

In this manner, …

In the final analysis, …

In short, …

Indeed, …

It can be seen from the above analysis that, …

It could also be said that, …

It is, however, important to note the limitations of …

It is important, however, not to assume the applicability of … in all cases.

It is important, however, not to overemphasise the strengths of …

In the face of such criticism, proponents of, …have responded in a number of ways.

Moreover, …

Notwithstanding such criticism, … its popularity remains largely undiminished.

Notwithstanding these limitations, … its worth remains in a number of situations.

Noting the compelling nature of this new evidence, …has suggested that.

Nevertheless, …remains a growing problem.

Nonetheless, the number of, …has continued to expand at an exponential rate.

On the other hand, critics of, …point to its blindness, with respect to.

Of central concern therefore to, …sociologists is explaining how societal processes and institutions…

Proponents of…, have also suggested that…

Subsequently, …

Similarly, …

The sentiment expressed in the quotation, embodies the view that, …

This interpretation of … has not been without its detractors, however.

This approach is similar to the, …. position

This critique, unfortunately, implies a singular cause of, …

This point is also sustained by the work of, …

Thirdly, …

The use of the term, …

Therefore, …

There appears then to be an acceleration in the growth of

There is also, however, a further point to be considered.

These technological developments have greatly increased the growth in, …

Thus, …

To be able to understand, …

Undoubtedly, …

Whilst the discussion in the preceding paragraph, …

Whether crime rates were actually lower at this time continues to be a matter of debate. Evidence from…

Sunday, March 4, 2012

Python programming Day 01

Glossary 

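#Extract a URL from a web page.
page='This is the udacity link <a href="http://udacity.com">' #example value; in practice, the contents of some web page as a string
start_link=page.find('<a href=') #position of the first link tag
start_quote=page.find('"',start_link) #opening quote of the URL
end_quote=page.find('"',start_quote + 1) #closing quote of the URL
url=page[start_quote+1:end_quote]
print url
page=page[end_quote:] #all the remaining text of the page, starting from the position end_quote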

#Find a target string t started from the location i in string s.
s.find(t,i)
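
A quick check of how find behaves may help here, since the later procedures rely on it returning -1 when nothing is found. This is only an illustrative sketch; the sample string is the one used in the find_second example in these notes.

```python
s = 'Hello David, Hello Dave'

print(s.find('Hello'))     # 0: first occurrence, searching from the start
print(s.find('Hello', 1))  # 13: searching from position 1 skips the first one
print(s.find('Tom'))       # -1: the target does not occur in s
```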

#Define a function
def summation(a,b):
    return a+b
print summation(4,5)

#Waiting for user's input
a_str=raw_input('What is your name? ')
print 'Hello ' + a_str

#Using the 'import' statement to import a library
import math
print math.sqrt(9)
print math.pow(5, 7)

#Find the position of the second occurrence of string 't' in string 's'.

def find_second(s,t):
    first=s.find(t)
    return s.find(t,first+1)
print find_second('Hello David, Hello Dave','Hello') #The result will be 13.

#A function uses IF statement to find out the bigger number between the two numbers

def bigger(n1,n2):
    if n1<n2:
        return n2
    else:
        return n1     
print bigger(3,8)

#A function uses IF statement to check if the first letter of a string is started with 'D'.

def is_friend(s):
    if s[0]=='D': #indexing the string using [ ]
        return True
    else:
        return False     
print is_friend('Dave')

#Another procedure uses IF and ELSE statement with 'or' logical operator

def is_friend(name):
    if (name[0] == 'D' or name[0] == 'N'):
        return True
    else:
        return False     
print is_friend('David')
print is_friend('Neville')
print is_friend('Tom')

#Another procedure, used to find the biggest of three numbers
def biggest(x,y,z):
    if (x >= y and x >=z):
        return x
    else:
        if (y >= x and y>=z):
            return y
        else:
            return z
print biggest(9,7,8)

#A better solution than the previous
def biggest(x,y,z):
    return bigger(bigger(x,y),z)
def bigger(x,y):
    if x>y:
        return x
    else:
        return y

# A procedure uses a 'while' loop to print all numbers up to n (n is included)
def print_numbers(n):
    i=1
    while i <= n:
        print i
        i=i+1
print_numbers(10)

# A procedure uses recursion to compute the factorial of number n

def factorial(n):
    if n<=1: #base case; also covers n==0
        return 1
    return n*factorial(n-1) #recursive case
print factorial(4)

# Another procedure uses 'while' loop to compute the factorial of number n
def factorial(n):
      result=1
      while(n >=1):
              result=result*n
              n=n-1
      return result
print factorial(4)

#The procedure 'get_next_target' is used to extract URL and the position of the end_quote
def get_next_target(page):
    start_link = page.find('<a href=')
    if start_link == -1:
        return None, 0
    else:
        start_quote = page.find('"', start_link)
        end_quote = page.find('"', start_quote + 1)
        url = page[start_quote + 1:end_quote]
        return url, end_quote
               
page='This is the udacity link <a href="http://udacity.com">'
url,post=get_next_target(page)
print url
print post

#Print all links in the page 'http://xkcd.com/353'
#get_page(url) is not defined in these notes; it is assumed to return the contents of the page as a string.

def print_all_links(page):
    while True:
        url, endpos = get_next_target(page)
        if url:
            print url
            page = page[endpos:]
        else:
            break
print_all_links(get_page('http://xkcd.com/353'))
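
Because get_page needs a network connection, the same loop can be tried on a string held in memory. The sketch below is a variation, not part of the lecture notes: get_all_links is a hypothetical helper that returns the links as a list instead of printing them, and the sample string is made up. It repeats get_next_target so that it runs on its own.

```python
def get_next_target(page):
    """Return (url, position of closing quote) for the first link, or (None, 0)."""
    start_link = page.find('<a href=')
    if start_link == -1:
        return None, 0
    start_quote = page.find('"', start_link)
    end_quote = page.find('"', start_quote + 1)
    return page[start_quote + 1:end_quote], end_quote

def get_all_links(page):
    """Collect every link by repeatedly cutting the page after the last match."""
    links = []
    while True:
        url, endpos = get_next_target(page)
        if not url:
            break
        links.append(url)
        page = page[endpos:]
    return links

sample = ('<a href="http://udacity.com">Udacity</a> and '
          '<a href="http://xkcd.com/353">xkcd</a>')
print(get_all_links(sample))  # ['http://udacity.com', 'http://xkcd.com/353']
```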

Every possible computer program can be written using the following concepts.
  1. arithmetic
  2. comparisons
  3. if
  4. procedures
References
Udacity, CS1 Lecture Notes, February-April 2012

Data mining for knowledge discovery DAY 01

Data mining for knowledge discovery

Disadvantages

Advantages

Support Vector Machines Day 01

Support Vector Machines (SVM)
Disadvantages

Advantages

Artificial neural networks Day 01

I. Artificial neural networks

Limitations
  1. Hard to explain the solution (explanation capabilities)
  2. Have to adjust (fine-tune) many parameters (e.g. training dataset, validation dataset, test dataset, network architecture (feedforward, recurrent/feedback networks), number of layers, number of hidden neurons, learning rate, initial weights, number of epochs, accuracy (error tolerance))
  3. Require a lot of resources (resource-intensive): time consuming, fast computers
Benefits
  1. Solve many problems: predictions, classifications, clustering
  2. Enable parallel processing, speeding up certain computations
II. Perceptron model for linear data analysis

Single layer perceptron for linear data analysis (e.g. pattern recognition)

Linearly separable classification problems (linearly separable patterns) vs non-linear pattern recognition problems
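
As a rough illustration of a single layer perceptron on a linearly separable problem, the sketch below trains one on the logical AND function using the classic perceptron learning rule. It is not taken from the lecture notes: the function names, the step activation with a threshold at zero, the learning rate, and the number of epochs are all assumptions chosen for the example.

```python
def predict(weights, bias, x):
    """Step activation: output 1 if the weighted sum is positive, else 0."""
    total = bias + sum(w * xi for w, xi in zip(weights, x))
    return 1 if total > 0 else 0

def train_perceptron(samples, epochs=25, rate=1):
    """Perceptron rule: nudge the weights by rate * error * input after each sample."""
    weights = [0] * len(samples[0][0])
    bias = 0
    for _ in range(epochs):
        for x, target in samples:
            error = target - predict(weights, bias, x)
            weights = [w + rate * error * xi for w, xi in zip(weights, x)]
            bias += rate * error
    return weights, bias

# Logical AND is linearly separable, so a single perceptron can learn it.
and_gate = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_gate)
print([predict(weights, bias, x) for x, _ in and_gate])  # [0, 0, 0, 1]
```

XOR, by contrast, is not linearly separable, so no choice of weights for a single perceptron classifies all four of its patterns correctly.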

III. Neural networks for non-linear processing

The following neural networks can be used for non-linear processing (non-linear data analysis)
  1. Multilayer perceptron (MLP) networks
  2. Radial basis function (RBF)
  3. Polynomial nets
  4. Generalized regression neural networks (GRNN)
  5. Generalized neural networks (GNN)
Training algorithms in MLP networks
  1. Backpropagation learning/ training algorithm (i.e. The error is propagated back from the output to adjust the weights.). The learning challenge is to find weights that result in the minimum error for the whole training dataset.
A multilayer perceptron network is considered as a "universal approximator".
However, there is not yet a theory that says how many hidden-layer units are needed to approximate a given function.
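
To make the idea of propagating the error back concrete, here is a minimal sketch of one backpropagation step for a network with two inputs, two hidden units, and one output. It is not from the lecture notes: the sigmoid activation, the squared-error measure, the starting weights, and the learning rate are assumptions picked for illustration.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_hidden, w_out):
    """Each hidden unit holds [w1, w2, bias]; w_out holds [v1, v2, bias]."""
    h = [sigmoid(w[0] * x[0] + w[1] * x[1] + w[2]) for w in w_hidden]
    y = sigmoid(sum(v * hj for v, hj in zip(w_out[:-1], h)) + w_out[-1])
    return h, y

def squared_error(x, target, w_hidden, w_out):
    _, y = forward(x, w_hidden, w_out)
    return 0.5 * (y - target) ** 2

def backprop_step(x, target, w_hidden, w_out, rate):
    """One gradient-descent update of all weights for a single training example."""
    h, y = forward(x, w_hidden, w_out)
    # Error signal at the output unit (derivative of the error w.r.t. its input).
    delta_out = (y - target) * y * (1.0 - y)
    # Error signals at the hidden units, propagated back through the output weights.
    delta_hidden = [delta_out * w_out[j] * h[j] * (1.0 - h[j]) for j in range(len(h))]
    new_w_out = [w_out[j] - rate * delta_out * h[j] for j in range(len(h))]
    new_w_out.append(w_out[-1] - rate * delta_out)
    new_w_hidden = [[w[0] - rate * d * x[0], w[1] - rate * d * x[1], w[2] - rate * d]
                    for w, d in zip(w_hidden, delta_hidden)]
    return new_w_hidden, new_w_out

x, target = [1.0, 0.0], 1.0
w_hidden = [[0.5, -0.4, 0.1], [-0.3, 0.2, 0.0]]  # arbitrary starting weights
w_out = [0.3, -0.2, 0.1]
before = squared_error(x, target, w_hidden, w_out)
w_hidden, w_out = backprop_step(x, target, w_hidden, w_out, rate=0.5)
after = squared_error(x, target, w_hidden, w_out)
print(after < before)  # True
```

A full training run would repeat this step over every example in the training dataset for many epochs.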

Multicategory classification problem (i.e. multi-output perceptron)

References
Grace Rumantir, Monash University FIT5167 Lecture Notes, 2012.

Glossary
Linear or non-linear neurons refer to the linear or non-linear activation function.
Neuron: a nerve cell in a biological nervous system.
Axon: outgoing terminals from a biological neuron.
Dendrites: the incoming connections to biological neurons.
Synapse: the area of electrochemical contact between two biological neurons.
The human brain is assumed to be made up of biological neural networks. The brain is composed of millions of networks, each having several thousand highly interconnected neurons. Networks are integrated with other networks (but not with all of them).
An artificial neural network is a computerized system whose structure and operation are modelled on those of a biological neural network.




Saturday, March 3, 2012

The use of laptops: needs for students

Attentiveness
Engagement
Learning

When students can pose questions via their laptops, the number of questions is higher than in traditional classes.
Students with laptops received slightly higher grades.
However, students reported that their laptops and those of their classmates are a distraction.

LectureTools is an interactive suite of web-based tools designed to allow questioning practices in lecture that actively engage students and go beyond the multiple choice format typically supported by classroom response systems (clickers).

LectureTools gives students the ability to take notes and make drawings on PowerPoint slides, rate their understanding of each slide, pose questions anonymously during the lecture, and review the recorded lecture after class.

LectureTools is a great way to take notes and stay alert in class. It helped me learn a great deal more.

Students with their laptops in classes spent more than ten minutes per class using social networking sites and email. 

Many students appear to weigh the options of using or not using laptops during class and make decisions based on what may be most helpful for their own learning. 

Lecturers need to
  1. Know how and when to ask students to use their laptops, rather than simply allowing students to bring them to class
  2. The questions posed by the instructor in LectureTools helped them better understand and learn lecture material.
  3. Faculty will need to think carefully about their approach to student laptop use and how they can maximize the benefits while minimizing the distraction.
  4. Set a laptop policy and communicate it to students
  5. Students are not encouraged to bring laptops to class. A closed laptop rule during lecture will be enforced and other communication devices will need to be on silent during lecture.
  6. When you use laptops during class, do not use laptops for entertainment during class and do not display any material on the laptop which may be distracting or offensive to your fellow students.
  7. Laptops may be used only for legitimate classroom purposes, such as taking notes, downloading class information, or working on an in-class exercise. E-mail, instant messaging, surfing the Internet, reading the news, or playing games are not considered legitimate classroom purposes; such inappropriate laptop use is distracting to those seated around you and is unprofessional.
  8. Identify a laptop-free zone in class
References
Tomorrows-professor Digest, Vol 62, Issue 1

Software for having virtual machines

Software for having virtual machines run
  1. VMware Player 4.0.2
  2. Oracle VM VirtualBox 4.1.8
  3. Microsoft Virtual PC 6.0.192
  4. KVM
  5. Ubuntu 11.10 on VM requires at least 4.4GB. Hence, 8GB USB disk is recommended.
  6. On VirtualBox Right Ctrl is used to escape mouse pointer from the VM window.
  7. Various virtual machines for VMware Player can be obtained from http://virtual-machine.org/. Most usernames are root and tom, and the password is tomtom.
  8. Oracle VirtualBox and Microsoft Virtual PC seem to run quite fast.
  9. In CentOS 6.2, the Flash plug-in for Firefox needs to be found and installed. Note that you need to log in as the root user first.
  10. Need to install VirtualBox 4.1.8 Oracle VM VirtualBox Extension Pack  to access USB devices.
  11. Need to use Virtual Media Manager to release old virtual disks if they are moved to other places.

Thursday, March 1, 2012

Articles, books, notes to read from 1-8 March 2012

  1. Image denoising algorithm via best wavelet packet base using Wiener cost function
  2. Adaptive surveillance video noise suppression
  3. Robust video denoising using Low rank matrix completion
  4. Wavelet based nonlocal-means super-resolution for video sequences
  5. Image sequence denoising via sparse and redundant representations
  6. Image and video denoising using adaptive dual-tree discrete wavelet packets
  7. Digital Image Restoration
  8. SURE-LET for orthonormal wavelet-Domain Video Denoising
  9. An augmented Lagrangian method for total variation video restoration
  10. Sparse representation for color image restoration
  11. Fractal Image Denoising
  12. Variational methods for image restoration
  13. The what, how, and why of wavelet shrinkage denoising
  14. Bivariate Shrinkage Functions for Wavelet-Based Denoising Exploiting Interscale Dependency
  15. Geometric Features-Based Filtering for Suppression of Impulse Noise in Color Images
  16. Combined Wavelet-Domain and Motion-Compensated Video Denoising Based on Video Codec Motion Estimation Methods
  17. Image Denoising Via Sparse and Redundant Representations Over Learned Dictionaries
  18. Vector filtering for color imaging
  19. Threshold selection for wavelet shrinkage of noisy data
  20. Fast Image Recovery Using Variable Splitting and Constrained Optimization
  21. Denoising methods: K-SVD, MK-SVD, BM3D
  22. Image denoising using multi-stage sparse representations
  23. ECE 101 signal and systems
  24. Install Ubuntu 11.10 in VirtualBox on Windows 7 http://www.youtube.com/watch?v=R1UiDF45tbs
  25. D. L. Donoho, “De-Noising by Soft-Thresholding,” IEEE Transactions on Information Theory, Vol. 41, No. 3, 1995, pp. 613-627.
  26. Y. F. Zheng and R. L. Ewing, “Feature-Based Wavelet Shrinkage Algorithm for Image Denoising,” IEEE Transactions on Image Processing, Vol. 14, No. 12, pp. 2024-2039.
  27. M. Nasri and H. Nezamabadi-pour, “Image Denoising in the Wavelet Domain Using a New Adaptive Thresholding Function,” Neurocomputing, Vol. 72, No. 4-6, 2009, pp. 1012-1025.
  28. Spectrum signal de-noising based on wavelet packet


Mounting USB drives in Windows Subsystem for Linux

Windows Subsystem for Linux can use (mount): SD card USB drives CD drives (CDFS) Network drives UNC paths Local storage / drives Drives form...