#1 |
|
Expert Programmer
|
Combining languages
How difficult is it to combine two programming languages to create a single application? Particularly, how would you go about this with C++ and Python? Thanks.
|
|
|
|
|
|
#2 |
|
Expert Programmer
Join Date: Dec 2004
Posts: 794
Rep Power: 4
Look into something called SWIG if you want to use C++ functions in Python. I'm not sure how you would go about doing it the other way around.
__________________
Few people deserve to be compared to (Rush) Limbaugh, most of them were convicted at the Nuremburg trials. --WilliamSChips on Slashdot |
|
|
|
|
|
#3 |
|
Hobbyist Programmer
|
Many scripting based languages have interpreters available that you can embed into C or C++ programs if that's what you want (languages such as Ruby, Perl, Lua, etc.)
|
|
|
|
|
|
#4 |
|
Programmer
Join Date: Jun 2006
Location: New York
Posts: 43
Rep Power: 0
Have you thought about using Jython?
__________________
It's not complex if you know what you're doing... |
|
|
|
|
|
#5 | |
|
Hobbyist Programmer
Join Date: Jan 2006
Location: Menidi, Athens, Greece
Posts: 234
Rep Power: 3
Quote:
But I should note that in the final release, inside the resulting package (apps in OS X are folders containing the executable(s) and their resources, such as images and sounds), the two programming languages live in separate files and are never mixed. I am saying all this because I think it's not difficult to create an application that uses two programming languages, but the two should not be mixed together without reason; otherwise you may end up with untraceable bugs or even spaghetti code.
__________________
Project::Soulstorm (personal homepage) |
|
|
|
|
|
|
#6 |
|
Hobbyist Programmer
Join Date: Jun 2006
Location: Ireland
Posts: 152
Rep Power: 3
If you want to combine C++ and Python then you're in luck, as there is already a comprehensive library written to achieve this: Boost.Python.
__________________
Visit my website BinaryNotions. |
|
|
|
|
|
#7 |
|
Expert Programmer
|
Uh, depends on what you are trying to do... Do you mean having a single source file with source code for two different languages, say PHP and JavaScript (one embedded in the other)?
Or are we talking about creating libraries in one language and linking them to an application that has been created in another language? Most compilers will allow you to link C libraries dynamically, however support beyond that depends strictly on the language/compiler.
__________________
Clifford Matthew Roche <geek@cliffordroche.com> Web Hosting: http://www.crd-hosting.com Consulting: http://www.crdev-consulting.com |
|
|
|
|
|
#8 |
|
Resident Grouch
Join Date: Jun 2005
Posts: 6,453
Rep Power: 10
Here's a low-level look at the question:
MIXED-LANGUAGE PROGRAMMING

Sometimes one needs to call library procedures which were written and compiled using a different language. Compiler writers, like the rest of us, have their own ideas about the best way to do things. Consequently, the interface to the library code may not match up with the interface that another language expects to be in force. It's somewhat like trying to plug a 110V appliance into a 220V outlet. If you force the fit, sparks are likely to fly.

The information presented here can't possibly be exhaustive. I hope it serves its purpose, and that there are no egregious errors.

Consider the following code snippet:

    int add (int m, int n)
    {
        return m + n;
    }

    int main (void)
    {
        int answer;
        int x = 2;
        int y = 3;
        answer = add (x, y);
    }

Perhaps the 'add' function has been written as a Fortran (or whatever) subroutine and exists in an object or library module available to you for use. When you have code like this, how does the called procedure know what values to add? How does the calling procedure know how (or where) to get the result? The answer is, "The compiler writer makes it happen."

If you consider assembly language, the native language of the machine, the mechanisms are up to the person writing the code. You may not know assembly language, but bear with me, the thrust is reasonably understandable. The machine has a number of highly localized variables called 'registers.' The assembly language coder can merely say,

1. Load register A with the value 2.
2. Load register B with the value 3.
3. Add register A and register B. (In this example, register A is the implicit accumulator, receiving the result, and in the process, destroying its original content. Perhaps you should have saved it in another place.)
4. Save the result (register A) to variable ANSWER (memory location 0x0040b3af).

The high-level-language user doesn't typically have access to the registers. Variables are located in memory and the compiler generates machine code to move them into and out of registers as necessary, performing the required operations in the process. For the 'add' example, the compiler will generate instructions to move variable m into a register, add the variable n (or perhaps even move variable n into a register before adding), and move the result from the accumulator into the receiving variable.

Since you may call 'add' repeatedly with different variables, not just 'immediate' values stored in the instruction, or 'global' variables always located at the same memory address, where does 'add' get its values? In many implementations, off the stack. (Note that the variables in the call need not match the variables in the function by name.
In practice the caller puts a value from a variable of some name in the appropriate place and the function retrieves it from there. The name is strictly a mnemonic for the memory address of the location to be used.)

Different programmers write different code. Different languages generate different machine code for identical or similar operations. Different microprocessors operate differently. Nevertheless, effective methods generally turn out to be highly similar in how they perform their functions. The same is true of moving information back and forth between code calling a function and the code representing the function.

Consider the requirements.

- There must be a common ground used for placing data to be operated on.
- All participants must know where the common ground is.
- The data must be placed in positions within the common ground such that all participants are cognizant of the precise location.
- The type and form of the data must be known to all participants.

TYPE AND FORM

Consider the last requirements first. The type of the data must be known. The integral value 1 is identical to the ASCII character value, SOH. Yet it is rarely a goal to add 'characters.' Types, therefore, are generally important. Bear in mind that some languages are more stringent than others regarding how specific the information regarding type must be. The compiler writer may have imbued the compiler with more or less ability to infer proper operation from syntax or context.

The form of the data must certainly be known. This boils down to those two terms that confuse the bejaysus out of a lot of people: BY VALUE and BY REFERENCE. In this discussion, please do not assume the meaning of REFERENCE to be the specific meaning given to it by C++ (despite its widespread usage in other forms, even within C++).

Let us say that you are my calculator and that we have agreed on the rules under which you will operate (and that you will do so correctly).
Our rules are that I will give you two pieces of paper, each of which contains a number, that you will sum them and return to me a piece of paper with the result. I then hand you one piece of paper with '2' written on it and another with '3' written on it. You perform appropriately and return to me a piece of paper with the number '5' written on it. We have agreed to pass the arguments by VALUE.

Let's change the rules. I will still give you two pieces of paper with numbers on them. You, however, will interpret those as page numbers, refer to your notebook, and add the numbers found on those pages. I pass you '2' and '3' as before. Page two of your notebook contains the number '10', and page three contains the number '7.' You return to me a piece of paper with the number '17.' We have passed the arguments by REFERENCE.

It is clear that if we are not operating on the same wavelength, by the same rules, the outcome is nonsense, unusable.

In C (and mostly in C++), arguments are passed by value. Passing by reference is accomplished by passing the VALUE of a reference object (a pointer). The function definition (char *p, say) makes this sneaky move clear to the function so that it knows to DEREFERENCE the VALUE when it uses it.

C++ has an entity called (perhaps unfortunately) a REFERENCE. This thangy is actually an alias for another object. Just another mnemonic by which to refer to the object. The effect of passing this type of 'reference' is to make the original object known within the scope of the procedure, much as a global object would be known. One operates directly with it with no need for dereferencing. This is true 'pass by reference.'

THE COMMON GROUND

Consider now the common ground, the area of shared information. It is typically a stack. It is, moreover, typically THE stack; the very same mechanism that the machine uses to keep track of where it was when it 'called' a function, or an interrupt occurred, or whatever.
This last function is an implicit mechanism in the operation of the machine and is not (normally, anyway) subject to your control, only your influence. Because it is a highly sensitive place to be messing with or corrupting, it is probably not a really good choice for a place for you to put buffers that you subsequently overflow (thus possibly destroying the place-marker for the CPU, as well as important values it has saved temporarily to free up some registers), but there you have it. The benefit, automatically recyclable memory in the short-term, seems to outweigh the drawbacks in most minds (including mine, but I try to be careful).

Most languages are arranged so that they push function arguments onto the stack, and may also push a place for the return (some returns will be directly in a machine register), and then call the function. The call causes the current location of the instruction pointer (where the code is that the machine is executing) to be stored on the stack. Other important machine state may also be saved. Interrupts usually cause more information to be saved than function calls. The instruction pointer is then set to the location where the function code (or interrupt service routine) lives. That makes execution proceed with the function's code.

When the function terminates, a 'return' instruction is executed. By then, any stack usage by the function has been 'unwound' and the stack is back in the state it was in when the function was called. If the function hasn't corrupted the stack, the stack pointer will be pointing to the saved return address (the place from which the function was called); that address will be placed into the instruction pointer and execution will resume from whence it left off.

A simple diagrammed illustration of a generic but typical stack operation is included at the end of this post.

We're back from seeing the sights from the home of the function.
Lo and behold, the parameters that were pushed onto the stack BEFORE the call (branch) to the function are (barring specific actions) still there! Useless as mammary glands on a boar hog, but still there. The compiler will have generated instructions, though, to remove them before proceeding. If the return value happened to be passed back on the stack, it too will be disposed of. This first method is called, CALLER CLEANS THE STACK. The stack is now in the pristine condition it possessed prior to the call, and yet, miraculously, a function has been performed and its work is irretrievably history.

Caller cleaning is the easiest cleaning to perform unless you can get some geek to build your processor with more complicated hardware and logic and expand on the instruction set. The method particularly makes sense if the caller is allowed to pass a variable number of arguments. Who knows better than he how many to pop off?

The other method is CALLEE CLEANS THE STACK. Bear in mind that when the callee (called function) is looking at the stack, the parameters are ON THE OTHER SIDE OF THE RETURN ADDRESS. Actually, it is easy to get around this, but you can bet your bottom dollar that some purist high-priest will make you do 137 "Oh hell, Martha!"'s if they catch YOU messing with it instead of the compiler writer. You pop the return address, pop the arguments, and sneak the return address back onto the stack. THEN you call 'return.' The 'x86 and others make it relatively simple. They let the compiler emit machine code for a 'return 8' (or whatever) that tells the CPU (that geek was at work) to return and then dispose of 8 (or whatever amount the arguments required) bytes.

DATA POSITION IN THE SHARED AREA

Lastly, precisely where are the arguments when the function receives control? If you add 2 and 3 it doesn't matter what order you fetch them in, but a divide is a tad more picky.
You have to have agreement as to whether you push the 2 first, then the 3, or the 3 first and then the 2. The most common choices are 'push left to right' and 'push right to left.' I suppose you could devise a method to push odd ones first, from the middle outwards, then even ones from the outside in, but I've not encountered that method, yet. C/C++ typically pushes from right to left. Some other languages push from left to right. Some implementations may differ from others. If you're a gambler, you can guess; perhaps you're not, and would like to inspect the code emitted by the other language. Perhaps you'd just like to research. There are some links at the end of the post.

So, then, you want to call a Fortran (or whatever) subroutine from C/C++? Learn how Fortran pushes its arguments, whether it uses BY VALUE or BY REFERENCE, and who cleans the stack. Then write your C function to do it the Fortran way, or write your Fortran subroutine to do it the C way. Don't do both, you'll just be in the same boat but rowing the other direction.

A STACK ILLUSTRATION

Most microprocessors have an internal pointer (the stack pointer) which references memory so that the micro can keep track of the point of execution as it varies because of interrupts, function calls, and so forth. The stack (memory to which it points) is also used by many systems (sometimes unfortunately) as a storage place for local values, saved registers, and so forth. Just prior to a call, the stack pointer, which is much like any pointer one defines, is pointing to some place in memory (designated by the programmer or the operating system) for its use. When you call a function, it works something like this (its usage varies somewhat from language to language).

stack pointer -->| orig position |   In most systems, the stack pointer moves toward
                 |               |   lower addresses as you use it.
                 ~               ~

At call:         | orig position |
                 | arguments     |
stack pointer -->| last argument |
                 |               |
                 |               |
                 ~               ~

There may be zero or more arguments. They are pushed onto the stack in a predetermined
order. For C/C++, it is typically right-to-left. The stack pointer moves with each push.

After call:      | orig position |
                 | argument here |
                 | (maybe more)  |
stack pointer -->| ret addr here |
                 |               |
                 |               |
                 ~               ~

Into procedure   | orig position |
Arguments avail  | arguments...  |   When you modify the argument(s), you modify
for use          | ret addr here |   the value(s) stored here. If an argument
                 | saved regs,   |   is a reference or pointer you may use it to
                 | locals, etc.  |   modify the value pointed to elsewhere
                 | in this area  |   (in the calling procedure, say). If you write
stack pointer -->|               |   more data to one of the local variables than it
                 ~               ~   can store, guess where the excess winds up.

The function does its work and unwinds the stack (locals, etc.)

Before return    | orig position |   Immediately before the return, after storage
                 | arguments...  |   for saved registers, locals, etc. has already
stack pointer -->| ret addr here |   been recovered (and disappeared). The arguments
                 |               |   are still on the stack.
                 |               |
                 ~               ~

After return:    | orig position |   Immediately after the return. The very first
stack pointer -->| arguments...  |   thing the machine is going to do next is destroy
                 |               |   the arguments, whether you've modified them or not.
                 |               |   Sometimes the arguments are removed by the called
                 |               |   function and the return address position adjusted
                 |               |   appropriately.
                 ~               ~

stack pointer -->| orig position |   And it's done; you are right back where you started,
                 |               |   bookkeeping wise, when you made the call. Any
                 |               |   changes you made to the arguments are history, for
                 ~               ~   all practical purposes (they may persist until the
                                     next stack operation).

If you passed a reference or pointer, any changes you made to the object(s) referred to are, of course, in force. If you modified the reference, itself, to point to something else, you could modify that something else, also (for example, subsequent bytes pointed to by a char *). The reference itself disappears.

If you pass an argument by value, that value is perfectly usable to the called procedure; it could specify a length to use for some operation, for example. If you modify the value, such modifications disappear when the arguments disappear, immediately after the called procedure returns to the caller. If you want lasting changes in the caller, you need to make them by reference or RETURN a value from the called procedure.

LINKS

http://www.digitalmars.com/ctg/ctgMixingLanguages.html
http://weblogs.asp.net/oldnewthing/a.../02/47184.aspx
http://msdn.microsoft.com/library/de...to_fortran.asp
__________________
Abstraction doesn't make it impossible to write bad code; it makes it possible to write superior code. Contributor's Corner: Grumpy on C++ Exceptions DaWei on Pointers |
|
|
|
|
|
#9 |
|
I eat cake for breakfast.
Join Date: Jul 2004
Location: In my box.
Posts: 4,434
Rep Power: 9
Did you just write that on the spot, DaWei? If you did, um... holy crap.
To the OP: you might be interested to know that it is possible to compile source files using different languages and link them into the same program using .NET. Of course, this does limit you somewhat. |
|
|
|
|
|
#10 |
|
Newbie
Join Date: Dec 2005
Posts: 4
Rep Power: 0
The runtime environment (== everything DaWei said), in other words, is what differs.
Build a virtual machine over the hardware, make the necessary assumptions and you will work it out. Anyway, good post, DaWei (had to say).
|
|
|