Reading the paper "A Context Free TAG Variant" from Charniak's group. I finally have some idea of what TAG (Tree-Adjoining Grammar) and TSG (Tree-Substitution Grammar) are.
One intuition behind TAG is: for some structures, if an adjunction were undone, the remaining derivation would still be a valid sentence that simply lacks the modifying structure of the auxiliary tree.
The grammar is more compact.
Tuesday, November 12, 2013
Friday, November 8, 2013
gdb Tutorial
from http://www.cs.cmu.edu/~gilpin/tutorial/
Introduction
This tutorial was originally written for CS 342 at Washington University. It is still maintained by Andrew Gilpin.
Who should read this?
This tutorial is written to help a programmer who is new to the Unix environment to get started with using the gdb debugger. This tutorial assumes you already know how to program in C++ and you can compile and execute programs. It also sort of assumes that you basically know what debugging is and that you have used a debugger on another system.
Source code
To help illustrate some of the debugging principles I will use a running example of a buggy program. As you progress through this tutorial, you will use the debugger to locate and fix errors in the code. The code can be downloaded here and a simple Makefile for the program can be downloaded here. The code is very simple and consists of two class definitions, a node and a linked list. There is also a simple driver to test the list. All of the code was placed into a single file to make illustrating the process of debugging a little easier.
Preparations
Environment settings
gdb is in the gnu package on CEC machines. If you don't have this package loaded then type pkgadd gnu at a shell prompt. If you can run g++, then you will be able to run gdb.
Debugging symbols
gdb can only use debugging symbols that are generated by g++. For Sun CC users, there is the dbx debugger which is very similar to gdb.
gdb is most effective when it is debugging a program that has debugging symbols linked in to it. With g++, this is accomplished using the -g command line argument. For even more information, the -ggdb switch can be used which includes debugging symbols which are specific to gdb. The makefile for this tutorial uses the -ggdb switch.
Debugging
When to use a debugger
Debugging is something that can't be avoided. Every programmer will at one point in their programming career have to debug a section of code. There are many ways to go about debugging, from printing out messages to the screen, to using a debugger, to just thinking about what the program is doing and making an educated guess as to what the problem is. Before a bug can be fixed, the source of the bug must be located. For example, with segmentation faults, it is useful to know on which line of code the seg fault is occurring. Once the line of code in question has been found, it is useful to know about the values in that method, who called the method, and why (specifically) the error is occurring. Using a debugger makes finding all of this information very simple.
Go ahead and make the program for this tutorial, and run the program. The program will print out some messages, and then it will print that it has received a segmentation fault signal, resulting in a program crash. Given the information on the screen at this point, it is near impossible to determine why the program crashed, much less how to fix the problem. We will now begin to debug this program.
Loading a program
So you now have an executable file (in this case main) and you want to debug it. First you must launch the debugger. The debugger is called gdb and you can tell it which file to debug at the shell prompt. So to debug main we want to type gdb main. Here is what it looks like when I run it:
agg1@sukhoi agg1/.www-docs/tutorial> gdb main
GNU gdb 4.18
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "sparc-sun-solaris2.7"...
(gdb)
(Note: If you are using Emacs, you can run gdb from within Emacs by typing M-x gdb. Then Emacs will split into two windows, where the second window will show the source code with a cursor at the current instruction. I haven't actually used gdb this way, but I have been told by a very reliable source that this will work. :)
gdb is now waiting for the user to type a command. We need to run the program so that the debugger can help us see what happens when the program crashes. Type run at the (gdb) prompt.
Here is what happens when I run this command:
(gdb) run
Starting program: /home/cec/s/a/agg1/.www-docs/tutorial/main
Creating Node, 1 are in existence right now
Creating Node, 2 are in existence right now
Creating Node, 3 are in existence right now
Creating Node, 4 are in existence right now
The fully created list is:
4 3 2 1
Now removing elements:
Creating Node, 5 are in existence right now
Destroying Node, 4 are in existence right now
4 3 2 1
Program received signal SIGSEGV, Segmentation fault.
Node<int>::next (this=0x0) at main.cc:28
28 Node<T>* next () const { return next_; }
(gdb)
The program crashed, so let's see what kind of information we can gather.
Inspecting crashes
So already we can see that the program was at line 28 of main.cc, that this points to 0, and we can see the line of code that was executed. But we also want to know who called this method and we would like to be able to examine values in the calling methods. So at the gdb prompt, we type backtrace which gives me the following output:
(gdb) backtrace
#0 Node<int>::next (this=0x0) at main.cc:28
#1 0x2a16c in LinkedList<int>::remove (this=0x40160, item_to_remove=@0xffbef014) at main.cc:77
#2 0x1ad10 in main (argc=1, argv=0xffbef0a4) at main.cc:111
(gdb)
So in addition to what we knew about the current method and the local variables, we can now also see what methods called us and what their parameters were. For example, we can see that we were called by LinkedList<int>::remove () where the parameter item_to_remove is at address 0xffbef014. It may help us to understand our bug if we know the value of item_to_remove, so we want to see the value at the address of item_to_remove. This can be done using the x command using the address as a parameter. ("x" can be thought of as being short for "examine".) Here is what happens when I run the command:
(gdb) x 0xffbef014
0xffbef014: 0x00000001
(gdb)
So the program is crashing while trying to run LinkedList<int>::remove with a parameter of 1. We have now narrowed the problem down to a specific function and a specific value for the parameter.
Conditional breakpoints
Now that we know where and when the segfault is occurring, we want to watch what the program is doing right before it crashes. One way to do this is to step through, one at a time, every statement of the program until we get to the point of execution where we want to see what is happening. This works, but sometimes you may want to just run to a particular section of code and stop execution at that point so you can examine data at that location. If you have ever used a debugger you are probably familiar with the concept of breakpoints. Basically, a breakpoint is a line in the source code where the debugger should break execution. In our example, we want to look at the code in LinkedList<int>::remove () so we would want to set a breakpoint at line 52 of main.cc. Since you may not know the exact line number, you can also tell the debugger which function to break in. Here is what we want to type for our example:
(gdb) break LinkedList<int>::remove
Breakpoint 1 at 0x29fa0: file main.cc, line 52.
(gdb)
So now Breakpoint 1 is set at main.cc, line 52 as desired. (The reason the breakpoint gets a number is so we can refer to the breakpoint later, for example if we want to delete it.) So when the program is run, it will return control to the debugger every time it reaches line 52. This may not be desirable if the method is called many times but only has problems with certain values that are passed. Conditional breakpoints can help us here. For our example, we know that the program crashes when LinkedList<int>::remove() is called with a value of 1. So we might want to tell the debugger to only break at line 52 if item_to_remove is equal to 1. This can be done by issuing the following command:
(gdb) condition 1 item_to_remove==1
(gdb)
This basically says "Only break at Breakpoint 1 if the value of item_to_remove is 1." Now we can run the program and know that the debugger will only break here when the specified condition is true.
Stepping
Continuing with the example above, we have set a conditional breakpoint and now want to go through this method one line at a time and see if we can locate the source of the error. This is accomplished using the step command. gdb has the nice feature that when enter is pressed without typing a command, the last command is automatically used. That way we can step through by simply tapping the enter key after the first step has been entered. Here is what this looks like:
(gdb) run
The program being debugged has been started already.
Start it from the beginning? (y or n) y
Starting program: /home/cec/s/a/agg1/.www-docs/tutorial/main
Creating Node, 1 are in existence right now
Creating Node, 2 are in existence right now
Creating Node, 3 are in existence right now
Creating Node, 4 are in existence right now
The fully created list is:
4 3 2 1
Now removing elements:
Creating Node, 5 are in existence right now
Destroying Node, 4 are in existence right now
4 3 2 1
Breakpoint 1, LinkedList<int>::remove (this=0x40160, item_to_remove=@0xffbef014) at main.cc:52
52 Node<T> *marker = head_;
(gdb) step
53 Node<T> *temp = 0; // temp points to one behind as we iterate
(gdb)
55 while (marker != 0) {
(gdb)
56 if (marker->value() == item_to_remove) {
(gdb)
Node<int>::value (this=0x401b0) at main.cc:30
30 const T& value () const { return value_; }
(gdb)
LinkedList<int>::remove (this=0x40160, item_to_remove=@0xffbef014) at main.cc:75
75 marker = 0; // reset the marker
(gdb)
76 temp = marker;
(gdb)
77 marker = marker->next();
(gdb)
Node<int>::next (this=0x0) at main.cc:28
28 Node<T>* next () const { return next_; }
(gdb)
Program received signal SIGSEGV, Segmentation fault.
Node<int>::next (this=0x0) at main.cc:28
28 Node<T>* next () const { return next_; }
(gdb)
After typing run, gdb asks us if we want to restart the program, which we do. It then proceeds to run and breaks at the desired location in the program. Then we type step and proceed to hit enter to step through the program. Note that the debugger steps into functions that are called. If you don't want to do this, you can use next instead of step which otherwise has the same behavior.
The error in the program is obvious.
At line 75 marker is set to 0, but at line 77 a member of marker is accessed.
Since the program can't access memory location 0, the seg fault occurs. In
this example, nothing has to be done to marker and the error can be avoided
by simply removing line 75 from main.cc.
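The failure mode found above can be reproduced in a few lines. The sketch below is not the tutorial's main.cc (which is not shown here); Node, push_front, remove_first, and length are hypothetical stand-ins. It walks a singly linked list with a marker pointer and a trailing pointer, and a comment marks the spot where a stray reset of the marker would make the next dereference hit a null pointer.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical minimal node; the tutorial's Node<T> class is not reproduced here.
struct Node {
    int value;
    Node* next;
};

// Push a new node at the front of the list and return the new head.
Node* push_front(Node* head, int value) {
    return new Node{value, head};
}

// Remove the first node holding `item` and return the (possibly new) head.
Node* remove_first(Node* head, int item) {
    Node* marker = head;      // node currently being examined
    Node* prev = nullptr;     // node one behind marker
    while (marker != nullptr) {
        if (marker->value == item) {
            Node* after = marker->next;
            if (prev == nullptr) head = after;  // removing the head
            else prev->next = after;            // unlink from the middle or end
            delete marker;
            return head;
        }
        // A stray "marker = nullptr;" here would make the next two lines
        // dereference a null pointer, which is the kind of crash seen in gdb.
        prev = marker;
        marker = marker->next;
    }
    return head;  // item not found; list unchanged
}

// Count the nodes in the list; handy for checking the result.
std::size_t length(const Node* head) {
    std::size_t n = 0;
    for (; head != nullptr; head = head->next) ++n;
    return n;
}
```

Building the list 4 3 2 1 with push_front and then calling remove_first(head, 1) walks to the last node without ever dereferencing a null marker, because the loop condition is checked before every access.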
If you look at the output from running the program, you will see first of all that the program runs without crashing, but there is a memory leak somewhere in the program. (Hint: It is in the LinkedList<T>::remove() function. One of the cases for remove doesn't work properly.) It is left as an exercise to the reader to use the debugger in locating and fixing this bug. (I've always wanted to say that. ;)
gdb can be exited by typing quit.
Further information
This document only covers the bare minimum number of commands necessary to get started using gdb. For more information about gdb see the gdb man page or take a look at a very long description of gdb here. Online help can be accessed by typing help while running gdb. Also, as always, feel free to ask questions on the newsgroup or you can ask me during lab hours.
Notes
- There is another bug in the source code for the linked list that is not mentioned in the above code. The bug does not show up for the sequence of inserts and removes that are in the provided driver code, but for other sequences the bug shows up. For example, inserting 1, 2, 3, and 4, and then trying to remove 2 will show the error. Special thanks to Linda Gu and Xiaofeng Chen for locating this bug. The bug fix is pretty simple and is left as an exercise.
- Special thanks to Ximmbo da Jazz for providing valuable fixes for some typos and erroneous output.
- Special thanks to Raghuprasad Govindarao for discovering a broken link.
Further notes:
If your program needs arguments, use:
gdb --args path/to/executable -every arg you can think <out.txt
To trace back the error: (gdb) backtrace
Tuesday, October 8, 2013
Learned something new
From the guy with Chris: constituent parsing is not as good as dependency parsing for languages that don't have a fixed word order, such as Russian.
New idea: maybe use tree adjoining grammar for my task. If the tree that includes the three parts is created, then we don't need parsing.
Friday, September 20, 2013
How does the brain work?
Watching the machine learning course on neural networks. Andrew mentioned that the somatosensory cortex handles the sense of touch, but if its input is rewired, it can learn to see. I remember reading a cognitive linguistics book which mentions that damage to some parts of the brain causes grammar learning problems. Can we rewire some neurons to learn this too?
Sunday, August 25, 2013
[paper read] No country for old members: User lifecycle and linguistic change in online communities
Uses language models to analyze user-level and community-level language change.
Claims that a user's susceptibility to lexical innovation increases during the user's "adolescence" in the community and then drops later in the user's lifespan in the community.
Thursday, August 1, 2013
Reading "The Language Instinct" by Steven Pinker
Chapter 1
The author's opinion: language ability is a human instinct, just like web spinning is for spiders. It is not molded by the surrounding culture.
The idea is deeply influenced by Chomsky.
If this is true, computer scientists will like it very much. If reinforcement learning can make machines learn to balance and to walk, maybe one day machines can learn to talk.
Question: but why do some words occur only in some cultures? For example, "幸灾乐祸" in Chinese and "Schadenfreude" (roughly "gloat") in German, which can be translated as "to take a mischievous pleasure in sth."
Chapter 2
The author's opinion: language is innate. One piece of evidence is the development from pidgin to creole. So children can create grammar automatically? I thought they learned it from their parents and the people around them. But the author's opinion is that even when given only a pidgin by their parents, children make the grammar more complete, turning it into a creole.
"Grammatical" means well-formed according to consistent rules in the dialect of the speakers.
Chapter 4 How language works
Talks about the grammar of languages; the author's opinion is that grammar is not learned.
Some funny jokes caused by mixing up the entries of verbs:
"call me a taxi." "OK, you are a taxi"
"We don't serve colored people."
"That's fine."
"I don't eat colored people. I'd like a piece of chicken."
Wednesday, July 31, 2013
Comparing pass by value, pass by pointer, and pass by reference in C++
from: http://blog.csdn.net/jimmy54/article/details/4748928
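Comparing C++ references and pointers
References are a C++ concept, and beginners easily confuse references with pointers.
In the following program, n is a reference to m, and m is the referent.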
int m;
int &n = m;
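n is effectively an alias (nickname) for m; any operation on n is an operation on m.
So n is neither a copy of m nor a pointer to m; n is m itself.
Rules for references:
(1) A reference must be initialized at the moment it is created (a pointer can be initialized at any time).
(2) There is no NULL reference; a reference must be bound to valid storage (a pointer can be NULL).
(3) Once a reference is initialized, what it refers to cannot be changed (a pointer can be pointed at a different object at any time).
In the example below, k is initialized as a reference to i.
The statement k = j does not turn k into a reference to j; it only changes the value of k to 6.
Since k is a reference to i, the value of i also becomes 6.
int i = 5;
int j = 6;
int &k = i;
k = j; // both k and i now have the value 6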
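Rule (3) is easiest to see next to the pointer case it is contrasted with. The sketch below wraps the example above in a function and adds a pointer counterpart; the function names assign_through_reference and reseat_pointer are hypothetical, chosen only for this illustration.

```cpp
#include <cassert>

// Assigning through a reference writes to the referent; it never rebinds.
// Returns the final value of i.
int assign_through_reference() {
    int i = 5;
    int j = 6;
    int& k = i;   // k is bound to i for its whole lifetime
    k = j;        // copies j's value into i; k still refers to i
    j = 7;        // changing j afterwards does not affect i or k
    return i;
}

// Assigning to a pointer reseats it; the old pointee is untouched.
// Returns the final value of i.
int reseat_pointer() {
    int i = 5;
    int j = 6;
    int* p = &i;  // p points at i
    p = &j;       // p now points at j; i is unchanged
    *p = 7;       // writes to j, not i
    return i;
}
```

With the reference, i ends up holding the value copied from j; with the pointer, i keeps its original value because the assignment changed where p points rather than what i holds.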
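The main use of references is passing function parameters and return values.
In C++, there are three ways to pass function parameters and return values:
pass by value, pass by pointer, and pass by reference.
Below is a "pass by value" example.
Because x inside Func1 is a copy of the external variable n, changing x does not affect n, so n is still 0.
void Func1(int x)
{
    x = x + 10;
}
...
int n = 0;
Func1(n);
cout << "n = " << n << endl; // n = 0
Below is a "pass by pointer" example.
Because x inside Func2 is a pointer to the external variable n, changing what the pointer points to changes n, so n becomes 10.
void Func2(int *x)
{
    (* x) = (* x) + 10;
}
...
int n = 0;
Func2(&n);
cout << "n = " << n << endl; // n = 10
Below is a "pass by reference" example.
Because x inside Func3 is a reference to the external variable n, x and n are the same thing; changing x changes n, so n becomes 10.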
void Func3(int &x)
{
    x = x + 10;
}
...
int n = 0;
Func3(n);
cout << "n = " << n << endl; // n = 10
Comparing the three examples above, "pass by reference" behaves like "pass by pointer" but is written like "pass by value".
Monday, July 29, 2013
Thursday, July 25, 2013
c++ extern
From:http://www.gamedev.net/page/resources/_/technical/general-programming/organizing-code-files-in-c-and-c-r1798
When the linker comes to create an executable (or library) from your code, it takes all the object (.obj or .o) files, one per translation unit, and puts them together. The linker's main job is to resolve identifiers (basically, variables or functions names) to machine addresses in the file. This is what links the various object files together. The problem arises when the linker finds two or more instances of that identifier in the object files, as then it cannot determine which is the 'correct' one to use. The identifier should be unique to avoid any such ambiguity. So how come the compiler doesn't see an identifier as being duplicated, yet the linker does?
Imagine the following code:
/* Header.h */
#ifndef INC_HEADER_H
#define INC_HEADER_H
int my_global;
#endif /* INC_HEADER_H */

/* code1.cpp */
#include "Header.h"
void DoSomething() { ++my_global; }

/* code2.cpp */
#include "Header.h"
void DoSomethingElse() { --my_global; }
This first gets compiled into two object files, probably called code1.obj and code2.obj. Remember that a translation unit contains full copies of all the headers included by the file you are compiling. Finally, the object files are combined to produce the final file.
Notice how there are two copies of "my_global" in that final block. Although "my_global" was unique for each translation unit (this would be assured by the use of the inclusion guards), combining the object files generated from each translation unit would result in there being more than one instance of my_global in the file. This is flagged as an error, as the linker has no way of knowing whether these two identifiers are actually the same one, or if one of them was just misnamed and they were actually supposed to be 2 separate variables. So you have to fix it.
The answer is not to define variables or functions in headers. Instead, you define them in the source files where you can be sure that they will only get compiled once (assuming you don't ever #include any source files, which is a bad idea for exactly this reason). This gives you a new problem: how do you make the functions and variables globally visible if they aren't in a common header any more? How will other files "see" them? The answer is to declare the functions and variables in the header, but not to define them. This lets the compiler know that the function or variable exists, but delegates the act of resolving the address to the linker.
To do this for a variable, you add the keyword 'extern' before its name:
extern int my_global;
The 'extern' specifier is like telling the compiler to wait until link time to resolve the 'connection'. And for a function, you just put the function prototype:
int SomeFunction(int parameter);
Functions are considered 'extern' by default so it is customary to omit the 'extern' in a function prototype.
Of course, these are just declarations that my_global and SomeFunction exist somewhere. They don't actually create them. You still have to do this in one of the source files, as otherwise you will see a new linker error when it finds it cannot resolve one of the identifiers to an actual address. So for this example, you would add "int my_global" to either code1.cpp or code2.cpp, and everything should work fine. If it was a function, you'd add the function including its body (i.e. the code of the function) into one of the source files.
The rule here is to remember that header files define an interface, not an implementation. They specify which functions, variables, and objects exist, but they are not responsible for creating them. They may say what a struct or class must contain, but they shouldn't actually create any instances of that struct or class. They can specify what parameters a function takes and what it returns, but not how it gets the result. And so on. This is why the list of what can go into a header file earlier in this article is important.
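Put together, the fixed layout described above might look like the sketch below. It is collapsed into a single block for compactness, with comments marking which file each part would live in; the `= 0` initializer is my addition and the split between files is an assumption based on the article's example names.

```cpp
#include <cassert>

// --- would live in the shared header (declarations only) ---
extern int my_global;          // declares the variable; no storage is created
void DoSomething();            // function prototypes are extern by default
void DoSomethingElse();

// --- would live in code1.cpp (the single definition) ---
int my_global = 0;             // the one definition the linker resolves to
void DoSomething() { ++my_global; }

// --- would live in code2.cpp ---
void DoSomethingElse() { --my_global; }
```

Because the header now only declares my_global, every translation unit that includes it refers to the same single object defined in one source file, so the linker no longer sees duplicate definitions.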
Tuesday, July 23, 2013
Eclipse+CDT
After two days of work, I finally got Eclipse+CDT working.
If you want to use your own makefile, create new project -> c++ project -> Makefile Project -> empty project.
If you want to debug, you need to compile with g++ -g in the makefile.
When you debug in Eclipse and something goes wrong, it is shown in the Console; choose the gdb or gdb traces view.
The trace tells you where the error is, for example:
058,863 306^done,stack=[frame={level="0",addr="0x000000000040593e",func="std::vector<std::vector<dou\
ble, std::allocator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::size\
",file="/usr/include/c++/4.6/bits/stl_vector.h",fullname="/usr/include/c++/4.6/bits/stl_vector.h",li\
ne="571"},frame={level="1",addr="0x000000000040506e",func="std::vector<std::vector<double, std::allo\
cator<double> >, std::allocator<std::vector<double, std::allocator<double> > > >::vector",file="/usr\
/include/c++/4.6/bits/stl_vector.h",fullname="/usr/include/c++/4.6/bits/stl_vector.h",line="279"},fr\
ame={level="2",addr="0x0000000000404b21",func="NodeVector::getVector",file="NodeVector.h",fullname="\
/home/ying/study/SocialNetwork/ConstParse/SVMKernelCV2/NodeVector.h",line="32"},frame={level="3",add\
r="0x0000000000413d62",func="compute_matrices_for_optimization",file="svm_learn.cpp",fullname="/home\
/ying/study/SocialNetwork/ConstParse/SVMKernelCV2/svm_learn.cpp",line="1967"},frame={level="4",addr=\
"0x0000000000413a6b",func="optimize_svm",file="svm_learn.cpp",fullname="/home/ying/study/SocialNetwo\
rk/ConstParse/SVMKernelCV2/svm_learn.cpp",line="1900"},frame={level="5",addr="0x0000000000412a5f",fu\
nc="optimize_to_convergence",file="svm_learn.cpp",fullname="/home/ying/study/SocialNetwork/ConstPars\
e/SVMKernelCV2/svm_learn.cpp",line="1644"},frame={level="6",addr="0x000000000040ebe5",func="svm_lear\
n_classification",file="svm_learn.cpp",fullname="/home/ying/study/SocialNetwork/ConstParse/SVMKernel\
CV2/svm_learn.cpp",line="835"},frame={level="7",addr="0x0000000000402b69",func="main",file="svm_lear\
n_main.cpp",fullname="/home/ying/study/SocialNetwork/ConstParse/SVMKernelCV2/svm_learn_main.cpp",lin\
e="110"}]
Friday, July 19, 2013
Stanford tool
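The tree structure created by the Stanford parser carries some useful information:
offset: (If the offsets are not set, you need to call setSpan on the tree first.)
import edu.stanford.nlp.ling.HasOffset;
import edu.stanford.nlp.ling.HasWord;
import edu.stanford.nlp.ling.Label;
import edu.stanford.nlp.trees.Tree;

Label label = tree.label();
int offset1 = ((HasOffset) label).beginPosition();
int offset2 = ((HasOffset) label).endPosition();

// Stores a token-index span in every node's label and returns the next free position.
public static int setSpan(Tree t, int left) {
    Label label = t.label();
    if (t.isPreTerminal()) {
        // A preterminal covers exactly one token position.
        ((HasOffset) label).setBeginPosition(left);
        ((HasOffset) label).setEndPosition(left);
        String labelS = label.value();
        // Empty elements (-NONE-) do not consume a position.
        if (labelS.contains("-NONE-") || labelS.contains("-none-")) {
            return left;
        }
        return left + 1;
    }
    int position = left;
    // Enumerate through daughter trees, threading the position through them.
    Tree[] kids = t.children();
    for (Tree kid : kids) {
        position = setSpan(kid, position);
    }
    // Parent span: from the first position to the last one used by its children.
    ((HasOffset) label).setBeginPosition(left);
    ((HasOffset) label).setEndPosition(position - 1);
    return position;
}
It can get the head word of every node:
String headWord = ((HasWord) (tree.label())).word();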
Thursday, July 18, 2013
Using Gaussian kernel for svm-light.
Sometimes it converges very slowly. Someone suggested that the parameters were not optimized.
So I tuned r in the Gaussian kernel. I take slow convergence to mean the model is sensitive to noise, so I need to increase r. More about the Gaussian kernel at
http://crsouza.blogspot.ca/2010/03/kernel-functions-for-machine-learning.html
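For reference, here is a minimal sketch of a Gaussian kernel, assuming that the r tuned above plays the role of the bandwidth sigma in the common form k(x, y) = exp(-||x - y||^2 / (2 sigma^2)). The function name gaussian_kernel is hypothetical, and svm-light's own flag and parameterization may differ from this form.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Gaussian (RBF) kernel: k(x, y) = exp(-||x - y||^2 / (2 * sigma^2)).
// Assumes x and y have the same length and sigma > 0.
double gaussian_kernel(const std::vector<double>& x,
                       const std::vector<double>& y,
                       double sigma) {
    double squared_distance = 0.0;
    for (std::size_t i = 0; i < x.size(); ++i) {
        const double diff = x[i] - y[i];
        squared_distance += diff * diff;
    }
    return std::exp(-squared_distance / (2.0 * sigma * sigma));
}
```

Under this form, a larger sigma makes the kernel value fall off more slowly with distance, so distant points still look similar and the resulting decision boundary is smoother.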
Wednesday, July 17, 2013
Deep Learning.
Started watching the neural network course. Saw something fun today. The lecturer thinks SVM is not a good model for tasks that need structure. He also thinks backpropagation failed in the 90s because of slow computers and small data sets.
He said he did not like approaches such as retreating to models that allow convex optimization. Hmm, isn't this what we learned in Dale's course?
1970-80s, anti-probability, LOL.
A book I should read: John von Neumann, "The Computer and the Brain", 1958.