Wednesday, February 29, 2012

Running a Linux web server Day 01

As the root user
create #pico /etc/apache2/sites-available/example.org.conf
run     #a2ensite example.org.conf

check the configuration #apache2ctl configtest

run     #apache2ctl graceful
run     #a2enmod userdir
run    #apache2ctl graceful

Install a web application
run    #apt-get install unzip imagemagick

set up mysql database
download coppermine project

cp coppermine project to /var/www/
unzip it and change the name cpg15x to photos
chmod 777 include
chmod 777 albums

CRM and data mining Day 02

Foreign keys must match corresponding PKs to maintain referential integrity.

A surrogate key is a unique value, usually an integer, assigned to each row in the dimension table. It becomes the PK.

An OLAP engine such as Analysis Services provides a better query language (than SQL) and better computational performance.

Entity Relationship modelling is a technique used to 'abstract' users' data requirements into a model that can be analyzed and ultimately implemented. The objectives of ER modelling are
  1. Avoid anomalies, and achieve processing and data storage efficiency by reducing data redundancy (storing data elements once).
  2. Provide flexibility and ease of maintenance.
  3. Protect the integrity of data by storing it once.
ER modelling and normalization simplify transaction processing as they make transactions as simple as possible (data is stored in one place only). However, normalized databases become very complex, making queries difficult and inefficient. A 'spider-web of joins' is required for many queries.

Every dimensional model is composed of one table with a multi-part key, called the fact table, and a set of smaller tables called dimension tables.

Each dimension table has a single-part primary key that corresponds exactly to one of the components of the multi-part key in the fact table.

A single ER diagram breaks down into multiple Dimensional Model diagrams, or 'stars'.

ER modelling (arguably) does not really model a business; rather, it models the micro-relationships among data elements.

Facts are the 'verbs or actions' of the business. E.g. taking an order, displaying a web page, printing a book,  handling a customer support request. 

A value in a fact table is a measurement. E.g. quantity ordered, sale amount, call duration.

The level of detail in a fact table is called the grain. It is recommended that the grain be kept at the lowest (or finest) level. E.g. one row per sale, one row per service call.

A fact always 'resolves' a many-to-many relationship between the parent (or dimension) tables.

The most useful facts in a fact table are numeric and additive. E.g. sales by month for the last year.

The best way to identify dimensions is to note down every time someone says the word 'by'. For example,
-Sales by manufacturing plant
-Deliveries by method of shipment


"Sales" will become a measure. Manufacturing plant becomes a dimension table.
"Deliveries" becomes a measure. Method of shipment becomes a dimension table.


Foreign keys in the fact table are not allowed to be NULL.
Measure columns are allowed to be NULL.


References
  1. FIT5158 Monash University Lecture Notes, 2011
Glossary of terms
Dimensional analysis
Multidimensional conceptual view
Intuitive and high-performance retrieval of data.
High-performance access
Fact tables: contain the measurements associated with a specific business process.
Dimension tables
'Star-like' structure
Star join
Star schema
Dimensional model vs ER model
Multi-part primary key: a primary key consisting of a set of keys taken from the dimension tables.
One-to-many relationship
Many-to-many relationship
Many-to-one relationship
Referential integrity
Foreign key (FK)
ETL: extract, transform and load.
SCD: slowly changing dimensions
Dimensional modelling concepts
Conformed dimensions

Tuesday, February 28, 2012

CRM and data mining Day 01

The update time: the length of time it takes for a change in the operational data to be reflected in the data warehouse. A 24-hour 'wrinkle of time' is recommended for the update time.

low-level granularity  vs. high-level granularity
The lower the level of granularity, the more versatile the query that can be issued.

high-level of detail (the details of every phone call made by a customer for a month) vs. low-level of detail (the summary of phone calls made by a customer for a month)


Granularity is the level of detail or summarization of the units in the DW. E.g. the details of every phone call made by a customer for a month vs. the summarized phone calls made by a customer for a month. You can always aggregate detailed data by summarizing, but you cannot disaggregate data that is stored only as summaries. This is the benefit of low-level granularity.
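
As a minimal MATLAB sketch (purely synthetic call records, for illustration only), per-call detail can always be rolled up to a per-customer summary, but the summary alone cannot be expanded back into the individual calls:

customer = [1 1 1 2 2 3]';                 % customer id of each call (low-level grain)
duration = [4 7 2 10 3 5]';                % minutes per call (detailed data)
summary  = accumarray(customer, duration)  % total minutes per customer (high-level grain)
% summary = [13; 13; 5] -- the individual call durations cannot be recovered
% from these totals, which is why low-level granularity is recommended.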

Customer relationship management (CRM) promises
  1. Faster customer service at lower costs
  2. Higher customer satisfaction
  3. Better customer retention
Ultimately, these lead to customer loyalty.

Essential business questions that can be answered by computers (decision support systems).
  1. Who are my customers and what products are they buying?
  2. Which customers are most likely to go to the competition?
  3. What impact will new products/services have on revenue and margins?
  4. What product promotions have the biggest impact on revenue?
  5. What is the most effective distribution channel?
  6. Which are our lowest/highest margin customers?
Organizations need to think about finding products for their customers rather than customers for their products.

A key focus of business intelligence is optimizing the lifetime value of customers. To do this we need to
  • Get to know the customers better -> create customer profiles
  • Interact appropriately with the customers ->
Customer intelligence or 'Analytical CRM' is the process of gathering, analyzing and exploiting information about a company's customer base.

There are 3 main steps:
  1. Build a data warehouse for customer intelligence.
  2. Use decision support tools (OLAP and reporting)
  3. Data mining
Data warehouses enable organizations to capture, organize and analyze information in order to make decisions about the future based on the past. The steps are as follows.
  1. Identify the data that needs to be gathered from the operational business systems.
  2. Place data in the data warehouse
  3. Ask questions of the data to derive valuable information.
Operational information: questions about situations right now
  1. How many unfulfilled orders are there?
  2. On which items are we out of stock?
  3. What is the position of a particular order?
Strategic information is concerned with sales of products over time. A 'mail order wine club' may wish to ask
  1. Which product lines are increasing in popularity and which are decreasing?
  2. Which product lines are seasonal?
  3. Which customers place the same orders on a regular basis?
  4. Are some products more popular in different parts of the country?
  5. Do customers tend to purchase a particular class of product?
Data warehouse is "A subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making processing" (W.H. Inmon)

Data warehouse is "A copy of transaction data, specifically structured for query and analysis" (Ralph Kimball)

Data warehouse data is a series of snapshots, each snapshot taken at one moment in time.

Data warehouse (DW) data always contains a time element, with entries for different values of time.

References
  1. FIT5158 Monash Lecture Notes, 2011
Glossary of terms

Fact table vs. dimension table
Data mart
Star join is an ideal structure for a data mart.
The centre of the star join is called the 'fact table'.
Agility
Integrated computerized decision support
Analysis, decisions, predictions
Margin customers
The lifetime value of customers
Customer-centric businesses
Customer-centric organisations
Elaborate: work out in detail (e.g. elaborate a plan)
Blanket marketing campaigns vs. targeted marketing campaigns
Monetary
OLAP: online analytical processing
OLTP: online transaction processing
A single unified data repository
Repetitive
Decision support system data (DSS data)
Clerical functions vs. managerial functions
Primitive data vs. derived data
Collective time horizon
The volume of data in the warehouse

Sunday, February 26, 2012

Questions for the revised Bloom's taxonomy

Learning levels 1, 2, and 3



Learning levels 4, 5, and 6



References
http://www.cccs.edu/Docs/Foundation/SUN/QUESTIONS%20FOR%20THE%20REVISED%20BLOOM.doc

Thursday, February 23, 2012

Wednesday, February 22, 2012

Research Day 01: Writing a research plan, types of research

I. Writing a research plan needs to answer the following questions.
  1. What do you intend to do?
  2. Why is the work important?
  3. What has already been done?
  4. How are you going to do the work?
A research design is concerned with turning a research question into a testing project. The research design has been considered a "blueprint" for research, dealing with at least four problems: what questions to study, what data are relevant, what data to collect, and how to analyze the results.

Research method: determines a way of doing a systematic investigation to establish facts. It needs to include step-by-step procedures to achieve the goal.

II. Types of Research
1. Quantitative research

  • Used to test objective theories by examining the relationship among variables
  • Used to quantify the data and generalise the results from the sample to the population of interest
  • Used to recommend a final course of action
  • Uses statistical data analysis

2. Basic research

  • Used to understand and generate more knowledge, e.g.
  • How to increase the productivity of the employees.
  • How to increase the effectiveness of small businesses.
  • How to improve the effectiveness of information systems.
  • Einstein's Theory of Relativity
  • Newton's Contributions

3. Applied research

  • Used to find solutions to everyday problems, cure illness, and develop innovative technologies, rather than to acquire knowledge for knowledge's sake.
  • Improve agricultural crop production
  • Treat or cure a specific disease
  • Improve the energy efficiency of homes, offices, or modes of transportation

4. Longitudinal research
5. Qualitative research

  • Is primarily exploratory research
  • Used to gain an understanding of underlying reasons, opinions, and motivations.
  • Provides insights into the problem or helps to develop ideas or hypotheses for potential quantitative research.
  • Used to develop an initial understanding
  • Uses non-statistical data analysis
  • Useful for finding detailed information about people's perceptions and attitudes

6. Descriptive research
  • Often used to determine characteristics of the target market
  • Answers "who, what, where, when, and how often" questions
  • Fact-finding investigation
7. Classification research
8. Exploratory research
  • Used when the problem is not well-defined
  • Used to determine the best research design, data collection method and selection of subjects.
  • Used to provide significant insight into a given situation
  • It is not typically generalizable to the population at large
  • It relies on secondary research such as reviewing available literature and/or data, or qualitative approaches such as informal discussions with consumers or employees.
  • Answers question like: "Why are ticket sales down?"
9. Causal research
  • Explores the relationship between two variables
  • Answers questions like: "Does increased advertising result in more ticket sales?"
III. Qualitative data vs quantitative data
1. Qualitative data is information about qualities; information that can't be measured, e.g. gender, softness of skin, softness of a cat, color of the sky, color of your eyes, agreement or disagreement with a statement or opinion.
2. Quantitative data: height, weight, blood pressure, shoe size, the length of the fingernails.

Tuesday, February 21, 2012

Course description, learning outcomes, learning activities, workload, assessment tasks

Course description
This course develops an understanding of how to carry out different image processing tasks.

Learning outcomes
Upon completion of this course, students will be able to:
  • Understand the processes and techniques used to capture, enhance, and restore digital images
  • Understand the processes and techniques used to encode and compress digital images
  • Know how to select an appropriate processing method for a given image processing problem
  • Explain how video sampling and rate conversion are carried out.
The learning activities include the following.
  • Attend lectures where syllabus material will be presented and explained, and the subject will be illustrated with demonstrations and examples; 
  • Complete tutorial questions and lab projects designed to give further practice in the application of theory and to give feedback on student progress and understanding; 
  • Complete written lab report consisting of numerical and other problems requiring an integrated understanding of the subject matter; and 
  • Survey and summarise the current literature to gain a sense of the current state of the art in the subject domain
  • Carry out private study, work through materials as presented in classes and in tutorial/lab session(s), and gain practice at solving conceptual and numerical problems.
Workload
Two-hour lecture and two-hour tutorial (or laboratory) (requiring advance preparation),
A minimum of 2-3 hours of personal study per one hour of contact time in order to satisfy the reading and assignment expectations.
You will need to allocate up to 5 hours per week in some weeks for use of a computer, including time for group work/discussions.

Assessment tasks
  • Final examination – assessment of theoretical knowledge and application 
  • Assignments – literature review and summary of a specialised topic 
  • Practical (lab exercises) – learning will be enhanced with lab exercises that encourage exploration of the topic. Laboratory reports are due one or two weeks after completion of the scheduled laboratory session.

Monday, February 20, 2012

Business processes, business analysis cycle

Business processes

Advertising (TV, Print, Online, Radio)
Promotions
Co-op programs
Web Site Marketing
PR
Orders Forecasting
Reseller Orders
Internet Orders
Purchasing
Parts Inventory
Manufacturing
Finished Goods Inv.
Shipping
Returns
Registration cards
Customer calls
Web support
Financial Forecasting
Exchange Rate Management
GL-Revenue & Expense
Cost Accounting
Payroll
Benefits Enrollment

Business analysis cycle

Plan->Implement->Monitor->Assess->Plan

Denoising and linear inverse problem

optimal diagonal estimator

Sunday, February 19, 2012

Compressed sensing

Compressed sensing (CS) was motivated in part by the desire to sample wideband signals at rates far lower than the Shannon–Nyquist rate, while still maintaining the essential information encoded in the underlying signal.

A common approach in engineering is to assume that the signal is bandlimited, meaning that the spectral contents are confined to a maximal frequency.

Bandlimited signals, whose spectral contents are confined to a maximal frequency B, have limited time variation and can therefore be perfectly reconstructed from equispaced samples taken at a rate of at least 2B, termed the Nyquist rate.

Conversion speeds of twice the signal’s maximal frequency component have become more and more difficult to obtain.

A common practice in engineering is demodulation in which the input signal is multiplied by the carrier frequency of a band of interest, in order to shift the contents of the narrowband transmission from the high frequencies to the origin.

A “holy grail” of CS is to build acquisition devices that exploit signal structure in order to reduce the sampling rate, and subsequent demands on storage and DSP. In such an approach, the actual information contents dictate the sampling rate, rather than the dimensions of the ambient space in which the signal resides.

At its core, CS is a mathematical framework that studies accurate recovery of a signal represented by a vector of length N from M << N measurements, effectively performing compression during signal acquisition.

The measurement paradigm consists of linear projections, or inner products, of the signal vector onto a set of carefully chosen projection vectors that act as a multitude of probes on the information contained in the signal.

In CS we do not acquire x directly but rather acquire M < N linear measurements y=phi*x using an M*N CS matrix phi.

y is the measurement vector. Ideally, the matrix phi is designed to reduce the number of measurements M as much as possible while allowing for recovery of a wide class of signals x from their measurement vectors y.

For any particular signal x_0 in R^N, an infinite number of signals x will produce the same measurements y_0= phi * x_0 = phi * x for the chosen CS matrix phi.

Sparsity is the signal structure behind many compression algorithms that employ transform coding, and is the most prevalent signal structure used in CS.
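
The measurement model above can be illustrated with a tiny MATLAB sketch on synthetic data (illustration only; it stops short of an actual sparse recovery algorithm). The least-norm estimate reproduces the measurements y exactly but is not sparse, which is why CS recovery must exploit the sparsity structure:

N = 256; M = 64; K = 5;              % ambient dimension, measurements, sparsity
x = zeros(N,1);
idx = randperm(N);
x(idx(1:K)) = randn(K,1);            % K-sparse signal
phi = randn(M,N)/sqrt(M);            % random Gaussian M x N CS matrix
y = phi*x;                           % M < N compressive measurements
x_l2 = pinv(phi)*y;                  % least-norm solution: fits y, but not sparse
fprintf('nonzeros: true %d, least-norm %d\n', nnz(x), nnz(abs(x_l2) > 1e-6));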

References
Marco F. Duarte and Yonina C. Eldar, Structured Compressed Sensing-From Theory to Applications, IEEE Transactions On Signal Processing, September 2011

DCT, DFT, FFT, unitary transform, autocorrelation matrices

It is much more efficient to decorrelate an image in the frequency domain than in the spatial domain.

It is more efficient to capture certain features of an image in the frequency domain for pattern classification and identification purposes than in the pixel domain.

For an image of size N × N pixels, a unitary transform implies that the image can be represented as a linear combination of N^2 basis images. These basis images may be independent of the image being transformed as in DCT, or may be computed from the image itself as in KLT.

If the elements of a unitary transform matrix A are real, then A is called an orthogonal transform and its inverse is its own transpose, that is, inverse(A) = transpose(A). However, a unitary transform need not be orthogonal.
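
A quick MATLAB check of the real (orthogonal) case, assuming the Image Processing Toolbox function dctmtx is available:

A = dctmtx(8);                       % 8 x 8 DCT matrix (real and unitary)
norm(A*A' - eye(8))                  % ~0: the inverse of A is its transpose
block  = magic(8);                   % an arbitrary 8 x 8 image block
coeffs = A*block*A';                 % forward 2-D transform of the block
norm(A'*coeffs*A - block)            % ~0: the block is a linear combination of the 64 basis images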

There are two implementations of the FFT algorithm called decimation in time and decimation in frequency, both of which use the divide and conquer rule. In decimation in time, we divide the input sequence into even and odd indexed sequences.

References
K. S. Thyagarajan, Still image and video compression with MATLAB, John Wiley & Sons, 2011.

Saturday, February 18, 2012

Concepts and problems to learn

Convex problem
Combinatorial optimization problem
|supp(theta)|: the size of the set of nonzero entries of the coefficient vector theta.

Friday, February 17, 2012

Sequence, series, approximation, representation, sparsity

In mathematics, a sequence is an ordered list of objects (or events). Like a set, it contains members (also called elements or terms), and the number of terms (possibly infinite) is called the length of the sequence. Unlike a set, order matters, and exactly the same elements can appear multiple times at different positions in the sequence. A sequence is a discrete function.

A series is, informally speaking, the sum of the terms of a sequence. Finite sequences and series have defined first and last terms, whereas infinite sequences and series continue indefinitely.

Approximation theory
Image representation
Sparsity signal processing

Articles, books to read 20-26 Feb 2012


  1. The analysis (Co-)Sparse Model (Analysis_MIA_2012.pptx) YES
  2. Structured Compressed Sensing: From Theory to Applications
  3. Sparsity Equivalence of Anisotropic Decompositions
  4. http://users.ece.gatech.edu/~justin/ECE-8823a-Spring-2011/Course_Notes.html
  5. Sparse models in machine learning
  6. compressive imagers
  7. analog to information converter
  8. computational photography
  9. Generalized sparsity in high-dimensional geometries
  10. http://cam.mathlab.stthomas.edu/wavelets/ima2011workshop.php
  11. Denoising by higher order statistics
  12. http://www.ceremade.dauphine.fr/~peyre/mspc/mspc-mia-12/
  13. http://fourierandwavelets.org/more.php
  14. Fourier and Wavelet Signal Processing
  15. CRM-Mallat-Course1, 2 , 3 , 4
  16. Compressed sensing cs-tutorial-ITA-feb08-complete
  17. low-rank approximation using the SVD and total least
  18. the Kalman filter
  19. changing the sampling rate digitally
  20. the Shannon and Haar filter banks
  21. linear vector spaces and bases
  22. Shannon and Haar wavelet bases
  23. wavelets and filter banks
  24. examples of linear inverse problems
  25. the SVD and the least-squares problem
  26. structured systems: circulant, Toeplitz, and identity+low-rank
  27. http://users.ece.gatech.edu/mcclella/SPFirst/LectureSlides/SPFirstLectureSlides.html
  28. Convex optimization http://videolectures.net/mlss2011_vandenberghe_convex/
  29. K-SVD Dictionary Learning
  30. Synthesis Sparse Model
  31. redundant representation model
  32. Analysis Dictionary Learning
  33. Matrix completion
  34. Denoising and linear inverse problem
  35. A Primer on WAVELETS and Their Scientific Applications 2nd Ed

Structure of a lesson plan for "Developing Assessment Tools"


Competencies/objectives:
At the completion of this module, the assessor will be able to develop assessment tools

Approximate duration
3 hours

Learning Outcomes
  1. Determine the relevant standards against which the candidate is being assessed
  2. Select assessment method(s) that meet the needs of the candidates and the organisation seeking to assess
  3. Develop assessment tools that will:
    • reflect the principles of assessment
    • incorporate principles of access and equity
    • meet the rules of evidence
    • provide choice, where appropriate
    • be sequenced to reflect competency development
    • be user friendly
    • be practicable
  4. Ensure clear and specific instructions for assessors are included
  5. Take into account storage and retrieval needs of the assessment tool
  6. Review and trial assessment tools to validate their applicability
Content
  1. Determining the standards for assessment
  2. Efficient evidence gathering
  3. Selecting assessment methods using an assessment matrix
  4. Developing assessment tools
  5. Developing assessment criteria
  6. Developing assessment policies
  7. Trialling assessment tools
Delivery strategies
  • Presentation
  • Discussion
  • Activities
Resource requirements
  • ASC Assessor Training Participant Manual
  • ASC Assessor Training Presenter’s Guide

Training and assessment Day 01

I Principles of trainers
Trainers need to design and develop the following.
  1. Objectives of the lesson
  2. Learning activities for the lesson (e.g. content, delivery mode, resources)
  3. Assessment tasks for the lesson
II Principles of assessment
  1. Valid
    a valid assessment assesses what it says it assesses and is undertaken in a situation that matches the workplace requirement.
    i.e. if it is a practical task that you are assessing, then you would request that the candidate simulate the task for assessment, not write about it.
  2. Reliable
    the methods of assessment clearly show whether the learner has achieved competence. The evidence is real, not opinions or thoughts. Another assessor would also make the same decision.
  3. Flexible
    the methods of assessment reflect the needs and circumstances of the person being assessed and the workplace.
  4. Fair
    you are fair to all those seeking assessment. No-one is disadvantaged by the methods used. The assessment methods and evidence required match with the level of the competency standard.
III Rules of Evidence
  1. The evidence gathered must be valid.
    i.e. if you were assessing a verbal communications competency for example, you may want to see the candidate communicating with a range of different people in a variety of situations and for a number of purposes.
  2. There must be sufficient evidence.
    You need to gather enough evidence to be confident of the candidate’s ability to demonstrate competence; this will also help to ensure that the assessment is reliable (i.e. it can be replicated by another assessor).
  3. The evidence must be authentic.
    Ensure that the work is the candidate's own. Direct evidence is the easiest to authenticate, though not always possible.
  4. The evidence needs to be current.
    It is important that any supplementary or indirect evidence is recent enough to:
    • Reflect the candidate’s current skills, and
    • Reflect the requirements of the current standards.
IV References and sources

ALTC Engineering and ICT statements
http://www.olt.gov.au/system/files/resources/altc_standards_ENGINEERING_090211.pdf

Aligning teaching for constructing learning, John Biggs
http://www.heacademy.ac.uk/assets/documents/resources/resourcedatabase/id477_aligning_teaching_for_constructing_learning.pdf

Tomorrow's Professor Mailing List
http://www.stanford.edu/dept/CTL/Tomprof/index.shtml

Msg.#1138 Top Ten Workplace Issues for Faculty Members and Higher Education Professionals
http://cgi.stanford.edu/~dept-ctl/cgi-bin/tomprof/posting.php?ID=1138

Tomorrow's Professor Msg.#849 Supporting Student Success Through Scaffolding
http://cgi.stanford.edu/~dept-ctl/cgi-bin/tomprof/posting.php?ID=849

Tomorrow's Professor Msg.#1096 Lose the Lectures
http://cgi.stanford.edu/~dept-ctl/cgi-bin/tomprof/posting.php?ID=1096

Tomorrow's Professor Msg.#498 The Constructivist View of Learning
http://cgi.stanford.edu/~dept-ctl/cgi-bin/tomprof/posting.php?ID=498

Tomorrow's Professor Msg.#555 The Nature of Learning
http://cgi.stanford.edu/~dept-ctl/cgi-bin/tomprof/posting.php?ID=555

Tomorrow's Professor Msg.#121 Tactics for Effective Questioning
http://cgi.stanford.edu/~dept-ctl/cgi-bin/tomprof/posting.php?ID=121

Questions for the revised Bloom's taxonomy

Thursday, February 16, 2012

Database Day 01: MySQL Commands

CREATE DATABASE myfirstsqldb;

CREATE TABLE employees (empid int not null, lastname varchar(30), firstname varchar(30), salary float, primary key (empid));

INSERT INTO `myfirstsqldb`.`employees` (`empid`, `lastname`, `firstname`, `salary`) VALUES ('1', 'Smith', 'John', '50000.00');

ALTER TABLE `myfirstsqldb`.`employees` ADD `streetnumber` INT, ADD `streetname` VARCHAR(30), ADD `streettype` VARCHAR(20);


UPDATE `myfirstsqldb`.`employees` SET `streetnumber` = '1', `streetname` = 'Gold', `streettype` = 'Road' WHERE `employees`.`empid` =1;

Image enhancement


  1. Display histogram
    • imhist(I)
  2. Histogram equalization
    • lena_eq=histeq(lena,256);
  3. Contrast-limited adaptive histogram equalization
    • lena_enhanced = adapthisteq(lena,'NumTiles', [8 8] ,'ClipLimit',0.0005); %Higher numbers result in more contrast, local histogram equalization
  4. Histogram sliding
    • lena_enhanced=lena;
    • greater_than_200 = find(lena > 200);
    • lena_enhanced(greater_than_200) = imadd(lena(greater_than_200),10); %shift the image histogram to the right 10 values for all pixel values greater than 200.

Tuesday, February 14, 2012

WPT using Uvi_wave wavelet toolbox

w = wpk2d(lena,h3,g3,basis2d);%transform lena image with analysis filters h3 and g3 with basis2d
pw = insband(0,w,basis2d,1); %set the low frequencies/low-pass coefficients/approximation part to ZERO
p = iwpk2d(pw,rh3,rg3,basis2d); %reconstruct the image of lena with synthesis filters rh3 and rg3
[x1,y1,x2,y2] = siteband(sizx,sizy,basis,1); % locate the low-pass coefficients
wx(y1:y2,x1:x2) = wx(y1:y2,x1:x2) + 10; % adjust the low-pass coefficients

where sizx = size(x,2) (number of columns) and sizy = size(x,1) (number of rows).

Format of the basis vector

The vector elements indicate the depth of every terminal node in the tree. The first element always corresponds to the branch performing the most lowpass filter iterations.

basis1d=[Approximation(A), Detail(D)]
basis=[1 3 3 2].

The basis encoding for 2-D signals is similar to the 1-D case. The basis vector goes over the terminal nodes of the filter bank tree, holding a value for each node: its depth level in the tree. Notice that a frequency-ordered basis format has not been considered for the 2-D case.

basis2d = [Approximation (A), Vertical (V), Horizontal (H), Diagonal (D)]
basis = [2 2 2 2  2 2 2 2  1  1].

Monday, February 13, 2012

Saturday, February 11, 2012

Hopfield networks

Hopfield network

A Hopfield net consists of a number of nodes, each connected to every other node: it is a fully-connected network.

A Hopfield net is a symmetrically-weighted network, since the weights on the link from one node to another are the same in both directions.

Each node has, like the single-layer perceptron, a threshold and a step-function. The nodes calculate the weighted sum of their inputs minus the threshold value, passing that through the step function to determine their output state.

The net takes only 2-state inputs: these can be binary (0,1) or bipolar (-1,+1).

Inputs to the network are applied to all nodes at once, and consist of a set of starting values, +1 or -1.

The output of the network is taken to be the value of all the nodes when the network has reached a stable, steady state.
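
A minimal bipolar Hopfield sketch in MATLAB (one made-up pattern, synchronous updates only; not a full implementation): the weights come from the Hebbian outer product, and the state is iterated until it stops changing, which is taken as the output.

p = [1 -1 1 1 -1 -1 1 -1]';          % stored pattern (bipolar)
W = p*p' - eye(length(p));           % Hebbian weights, no self-connections
x = p; x([2 7]) = -x([2 7]);         % corrupted input: two bits flipped
for t = 1:10                         % synchronous updates until stable
    x_new = sign(W*x);
    x_new(x_new == 0) = 1;           % break ties consistently
    if isequal(x_new, x), break; end
    x = x_new;
end
isequal(x, p)                        % true: the net settles on the stored pattern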

Stochastic Hopfield network


The Boltzmann machine

Recurrent neural networks


Forecasting using linear neuron models demonstrates that, with input lags, these models are equivalent to autoregressive (AR) models in classical time-series modeling. When error lags are introduced to this model, it represents an autoregressive moving average (ARMA) model.
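
A small MATLAB sketch of this equivalence on a synthetic AR(2) series (illustration only): training a linear unit on p lagged inputs by least squares is exactly an AR(p) model fit.

rng(0); n = 500; a = [0.6 -0.3];                  % true AR(2) coefficients
y = zeros(n,1); e = 0.1*randn(n,1);
for t = 3:n, y(t) = a(1)*y(t-1) + a(2)*y(t-2) + e(t); end
X = [y(2:n-1) y(1:n-2)];                          % input lags y(t-1) and y(t-2)
w = X \ y(3:n);                                   % least-squares 'neuron' weights
disp(w')                                          % close to [0.6 -0.3]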

Nonlinear networks for time-series forecasting use modified backpropagation with short-term memory filters, represented by input lags (focused time-lagged feedforward networks) and recurrent networks with feedback loops, which capture the long-term memory dynamics of a time-series and embed it in the network structure itself.

The results show that use of a nonlinear network improves short-term and long-term forecasting ability compared to linear models.

Recurrent networks are variants of nonlinear autoregressive (NAR) models. When error is incorporated as input, they become nonlinear autoregressive integrated moving average (NARIMA) models and NARIMAx models with exogenous inputs.

The recurrent networks outperform focused time-lagged feedforward networks and are more robust in long-term forecasting.

There are three types of recurrent networks
  1. Elman networks: the hidden-layer activation is fed back as an input at the next time step.
  2. Jordan networks: the output is fed back as an input.
  3. Fully recurrent networks:  when both the output layer and the hidden layer feed their delayed outputs back to themselves.
Another requirement is the ability to efficiently select the most relevant inputs from a large set of correlated and redundant inputs.

References
  1. S. Samarasinghe, Neural Networks for Applied Sciences and Engineering: From Fundamentals to Complex Pattern Recognition, Auerbach Publications, 2007, Chapter 9.

Associative memory network types

If we can associate a pattern with itself by making the input and output patterns the same, then a presentation of an incomplete pattern on the input will result in the recall of the complete pattern. This memory is called auto-associative.

If the input pattern is taught in association with a different output pattern, then the presentation of this input will cause the corresponding pattern to appear on the output. This memory is called hetero-associative.

1. Auto-associative X=Y
    Recognize noisy versions of a pattern
    Example: Hopfield Memory Network

2. Hetero-associative bidirectional: X<>Y
    Iterative correction of input and output
    Example: BAM=Bidirectional Associative Memory

3. Hetero-associative input correcting: X<>Y
    Input clique (circle) is auto-associative => repairs input patterns

4. Hetero-associative output correcting: X<>Y
    Output clique (circle) is auto-associative => repairs output patterns

Hebbian learning rule states that
"Two neurons which are simultaneously active should develop a degree of interaction higher that those neurons whose activities are uncorrelated."

The Hopfield and BAM nets have two major limitations:
  1. The number of patterns that can be stored and accurately recalled is a function of the number of nodes in the net.
  2. There is a problem in recalling a correct pattern when two input patterns share so many identical features (e.g. too many pixel values in common).
Weight matrix representation

Hetero association: W=Sum of (X_transpose x Y)
Auto association: W=Sum of (X_transpose x X)

where X=matrix of input patterns, where each row is a pattern.
          Y=matrix of output patterns, where each row is a pattern.

Ideally, the weights in the weight matrix W will record the average correlations across all patterns.
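
A toy MATLAB illustration of the hetero-associative case with made-up bipolar patterns (not a full BAM implementation):

X = [ 1 -1  1 -1;                    % each row of X is an input pattern
     -1 -1  1  1];
Y = [ 1  1 -1;                       % each row of Y is the associated output
     -1  1  1];
W = X'*Y;                            % same as summing x_k' * y_k over the patterns
recall = sign(X(1,:)*W)              % recovers Y(1,:) = [1 1 -1]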





Friday, February 10, 2012

LAMP Terminologies

log in to mysql server mysql -u root -p
manage mysql server http://localhost/phpmyadmin/
set up password for root account mysqladmin -u root -p password yourpassword
create a database named mydb mysqladmin -u root -p create mydb
More information can be found at https://help.ubuntu.com/community/ApacheMySQLPHP

See system default values for a new user #/usr/sbin/useradd -D
Change the default shell #useradd -D -s /bin/bash
Add a new user # /usr/sbin/useradd test
Remove a user # /usr/sbin/userdel -r test
Change SHELL for user temp # chsh -s /bin/csh temp

Image restoration/recovery terminologies

I. Terminologies
  1. Image recovery
  2. Image restoration
  3. Image deconvolution
  4. Image demosaicking
  5. Image pansharpening
  6. Image inpainting
  7. Image deblocking
  8. Point-spread function (PSF)
  9. Optical transfer function
  10. De-convolution problem
  11. Input distribution
  12. Output distribution
  13. An inverse filter
  14. Noise spectrum
  15. Gaussian PSF
  16. Wiener–Helstrom filter
  17. Bayesian estimators
  18. Least-squares estimators
  19. Least-squares restoration
  20. Blind deconvolution
  21. Maximum-likelihood blind de-convolution
II. Noise reduction in spatial domain

Order-statistic filters, a.k.a. rank filters or order filters (see the MATLAB sketch after this list)
  1. Median filter: select the middle pixel value from the ordered set of values within the m x n neighbourhood (W) to replace the reference pixel.
  2. Min filter used to replace the reference pixel with the minimum value of the ordered set
  3. Max filter used to replace the reference pixel with the maximum value of the ordered set. The min filter is useful for reduction of salt noise, whereas the max filter can help remove pepper noise.
  4. Midpoint filter used to replace the reference pixel with the average of the highest and lowest pixel values within a window. It is used to reduce Gaussian and uniform noise in images.
  5. Alpha-trimmed mean filters use another combination of order statistics and averaging, in this case an average of the pixel values closest to the median, after the D lowest and D highest values in an ordered set have been excluded. They are used when images are corrupted by more than one type of noise.
Adaptive filters such as edge-preserving smoothing filters. They aim to apply a low-pass filter to an image in a selective way, minimising the edge blurring effect that would be present if a standard LPF had been applied to the image.
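
A minimal sketch of the rank filters above, assuming the Image Processing Toolbox (imnoise, ordfilt2) and its sample image cameraman.tif:

I = imread('cameraman.tif');
J = imnoise(I, 'salt & pepper', 0.05);
med = ordfilt2(J, 5, ones(3,3));     % 5th of 9 values = 3x3 median filter
mn  = ordfilt2(J, 1, ones(3,3));     % min filter: reduces salt (bright) noise
mx  = ordfilt2(J, 9, ones(3,3));     % max filter: reduces pepper (dark) noise
figure, subplot(2,2,1), imshow(J), subplot(2,2,2), imshow(med)
subplot(2,2,3), imshow(mn), subplot(2,2,4), imshow(mx)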

III. Noise reduction using the frequency domain techniques

The following filters can remove periodic noise.
  1. Bandpass filter
  2. Bandreject filter
  3. Notch filter
  4. Image deblurring with the following Matlab commands: 
    1. deconvreg: deblur image using regularized filter
    2. deconvlucy: deblur image using Lucy–Richardson method
    3. deconvblind: deblur image using blind deconvolution
  5. Inverse filtering
  6. Wiener Filtering (Matlab command deconvwnr)
    The Wiener filter can be used to restore images in the presence of blurring only (i.e., without noise). In such cases, the best results are obtained for lower values of K.
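
A short MATLAB sketch of the blur-only case using deconvwnr (the sample image and blur parameters below are arbitrary choices):

I = im2double(imread('cameraman.tif'));
PSF = fspecial('motion', 21, 11);          % simulated motion-blur PSF
blurred = imfilter(I, PSF, 'conv', 'circular');
K = 0.0001;                                % small noise-to-signal power ratio
restored = deconvwnr(blurred, PSF, K);
figure, imshow(blurred), figure, imshow(restored)
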
IV. Image Recovery Examples

  1. Defocusing
  2. Space image restoration
  3. Cross-channel degradation (e.g. original color image -> black and white image -> color image)
  4. Blind Spatially Varying Restoration
  5. Blocking artifact removal
  6. Video blocking artifact removal
  7. Error concealment
  8. Inpainting
  9. Image super-resolution
  10. Compressed video super-resolution
  11. Dual exposure restoration
  12. Pansharpening problem
  13. Demosaicking
  14. Tracking blurred objects
V. Sources of degradation
  1. motion
  2. atmospheric turbulence
  3. out-of-focus lens
  4. limitations of acquisition systems (optics, physics, cost, etc)
  5. finite resolution of sensors
  6. quantisation errors
  7. transmission errors
  8. noise  

VI. Forms of the recovery problem
  1. Noise smoothing
  2. Restoration/deconvolution (1D,2D,3D): multi-spectral, multi-channel
  3. Removal of compression artifacts
  4. Super-resolution: pansharpening, demosaicking
  5. Inpainting, concealment
  6. Dual exposure imaging
  7. Reconstruction from Projections
  8. Compressive sensing
  9. Light-field Reconstruction
  10. Spatially adaptive constrained least-squares restoration filter

VII. MATLAB Commands
  1. fftshift Shift zero-frequency component to centre of spectrum
  2. edgetaper Taper discontinuities/smooth along image edges
  3. deconvblind Deblur image using blind deconvolution
  4. fspecial Create predefined 2-D filter
  5. medfilt2: median filter
  6. ordfilt2: order-statistic filtering, used to apply min, max, and median filters to corrupted images
  7. nlfilter or colfilt are sliding window neighbourhood operations. One of these two functions can be used. colfilt is considerably faster than nlfilter.
  8. This example gives the same result as using the medfilt2 command with a 3-by-3 neighbourhood.
    A = imread('cameraman.tif');
    fun = @(x) median(x(:)); %in-line function declaration
    B = nlfilter(A,[3 3],fun); %apply a sliding window neighbourhood operation
    imshow(A), figure, imshow(B)
Arithmetic mean filters show satisfactory performance for images corrupted by Gaussian noise.

Median filters perform well with images corrupted by salt and pepper noise.

The harmonic mean filter is a variation of the mean filter and works well on salt noise and Gaussian noise. However, it fails to work on pepper noise.

The geometric mean filter can preserve image detail better than the arithmetic mean filter and works best on Gaussian noise.
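
Neither of these two filters has a dedicated built-in MATLAB function, so here is a rough sketch of both using imfilter over a 3 x 3 window (parameters are illustrative only):

A = im2double(imread('cameraman.tif'));
A = imnoise(A, 'gaussian', 0, 0.01);
w = 3; box = ones(w);
geo  = exp(imfilter(log(A + eps), box/w^2, 'replicate'));    % geometric mean filter
harm = (w^2) ./ imfilter(1./(A + eps), box, 'replicate');    % harmonic mean filter
figure, imshow(A), figure, imshow(geo), figure, imshow(harm)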

VIII. References
  1. Oge Marques, Practical Image and Video Processing Using MATLAB, Wiley-IEEE Press, September 2011.

Thursday, February 9, 2012

Stereo vision

Stereo matching
A disparity map tells how far each point in the physical scene was from the camera
A range image

Image rectification is the process of projecting multiple images onto a common image plane and aligning their coordinate systems.

Image pyramiding
General epipolar lines
Contouring effects refer to the absence of smooth transitions between regions of different disparity.
Finding the optimal disparity estimates for a row of pixels
Striation pattern
Multiple view geometry
Camera intrinsics matrix
Stereo depth map

Wednesday, February 8, 2012

Scientific Software Developer job requirements

Experience with GUI development, scientific visualisation, OpenMP and MKL parallel software libraries including BLAS, LAPACK and FFT libraries would be an advantage, as would experience with using MATLAB.

Knowledge or experience with some or all of the following would be an advantage: time series analysis, Fourier and Z transform methods, SVD, Eigen-Pattern-Analysis, statistics, nonlinear methods, complex systems, and power law phenomena.

Knowledge and/or experience with advanced optimisation methods such as nonlinear least squares, Monte Carlo methods, Simulated Annealing, Genetic Algorithms, and Neural Networks will be an advantage.

Mounting USB drives in Windows Subsystem for Linux

Windows Subsystem for Linux can use (mount): SD cards, USB drives, CD drives (CDFS), network drives, UNC paths, local storage / drives, drives form...